Compare commits

...

1737 Commits
rm ... 1.21.5

Author SHA1 Message Date
sabaimran
7216a06f5f Release Khoj version 1.21.5 2024-09-03 21:58:00 -07:00
sabaimran
895f1c8e9e Gracefully close thread when there's an exception in the anthropic llm thread. Include full stack traces. 2024-09-03 13:16:51 -07:00
sabaimran
17901406aa Gracefully close thread when there's an exception in the openai llm thread. Closes #894. 2024-09-03 13:16:51 -07:00
sabaimran
6ed68b574b Merge pull request #898 from lvnilesh/patch-1
Handles deprecation of version reference
2024-09-03 12:53:44 -07:00
sabaimran
912cc0074a Use nonlocal for conversation_id when running the event_generator 2024-09-03 11:55:06 -07:00
sabaimran
591f5a522c Release Khoj version 1.21.4 2024-09-02 17:45:39 -07:00
sabaimran
9306a0bb2c Prefetch the settings and openai_config of a texttoimagemodelconfig 2024-09-02 17:35:21 -07:00
sabaimran
132eac0f51 Merge pull request #897 from khoj-ai/features/increase-rate-limits
Increase rate limits for data indexing
2024-08-25 23:39:30 -07:00
LV Nilesh
77cc1cd42f Update docker-compose.yml
Handles deprecation of version reference
2024-08-25 17:05:47 -07:00
sabaimran
977001b801 Reduce the test data packet size 2024-08-25 16:14:32 -07:00
sabaimran
6eb06e8626 Downgrade rate limit to 200MB 2024-08-25 15:26:27 -07:00
sabaimran
439a2680fd Increase rate limits for data indexing 2024-08-25 15:09:30 -07:00
sabaimran
af4e9988c4 Merge pull request #896 from khoj-ai/features/add-support-for-custom-confidence
Add support for custom search model-specific thresholds
2024-08-24 20:32:41 -07:00
sabaimran
4b77325f63 Default to infinite distance when using the search API 2024-08-24 19:57:49 -07:00
sabaimran
e919d28f1c Add support for custom search model-specific thresholds 2024-08-24 19:28:26 -07:00
sabaimran
fa4d808a5f Encode uri components when sending automations data to the server 2024-08-24 18:45:50 -07:00
sabaimran
387b7c7887 Release Khoj version 1.21.3 2024-08-23 11:15:15 -07:00
sabaimran
7b8b3a66ae Revert django version to previous patch 2024-08-23 11:12:41 -07:00
Debanjum Singh Solanky
5927ca8032 Properly close chat stream iterator even if response generation fails
Previously chat stream iterator wasn't closed when response streaming
for offline chat model threw an exception.

This would require restarting the application. Now application doesn't
hang even if current response generation fails with exception
2024-08-23 02:06:26 -07:00
Debanjum Singh Solanky
bdb81260ac Update docs to mention using Llama 3.1 and 20K max prompt size for it
Update stale credits to better reflect bigger open source dependencies
2024-08-22 20:27:58 -07:00
Debanjum Singh Solanky
238bc11a50 Fix, improve openai chat actor, director tests & online search prompt 2024-08-22 19:09:33 -07:00
Debanjum Singh Solanky
9986c183ea Default to gpt-4o-mini instead of gpt-3.5-turbo in tests, func args
GPT-4o-mini is cheaper, smarter and can hold more context than
GPT-3.5-turbo. In production, we also default to gpt-4o-mini, so makes
sense to upgrade defaults and tests to work with it
2024-08-22 19:04:49 -07:00
Debanjum Singh Solanky
8a4c20d59a Enforce json response by offline models when requested by chat actors
- Background
  Llama.cpp allows enforcing response as json object similar to OpenAI
  API. Pass expected response format to offline chat models as well.

- Overview
  Enforce json output to improve intermediate step performance by
  offline chat models. This is especially helpful when working with
  smaller models like Phi-3.5-mini and Gemma-2 2B, that do not
  consistently respond with structured output, even when requested

- Details
  Enforce json response by extract questions, infer output offline
  chat actors
  - Convert prompts to output json objects when offline chat models
    extract document search questions or infer output mode
  - Make llama.cpp enforce response as json object

- Result
  - Improve all intermediate steps by offline chat actors via json
    response enforcement
  - Avoid the manual, ad-hoc and flaky output schema enforcement and
    simplify the code
2024-08-22 18:07:44 -07:00
Debanjum Singh Solanky
ab7fb5117c Release Khoj version 1.21.2 2024-08-20 12:38:54 -07:00
Debanjum Singh Solanky
de24ffcf0d Upgrade Axios, a desktop app dependency, to version 1.7.4 2024-08-20 12:32:36 -07:00
Debanjum Singh Solanky
a60baa55fb Upgrade Django, a Khoj server dependency, to version 5.0.8 2024-08-20 12:32:00 -07:00
sabaimran
1ac8de6c3a Release Khoj version 1.21.1 2024-08-20 11:55:34 -07:00
Debanjum Singh Solanky
5d59acd1f4 Stop pushing deprecated khoj-assistant package to pypi
- Also skip uploading package version to it already exists on pypi
  This happens when a release is new khoj tagged release is created
2024-08-20 11:43:02 -07:00
sabaimran
f6ce2fd432 Handle end of chunk logic in openai stream processor 2024-08-20 10:50:09 -07:00
sabaimran
029775420c Release Khoj version 1.21.0 2024-08-20 10:01:56 -07:00
sabaimran
4808ce778a Merge pull request #892 from khoj-ai/upgrade-offline-chat-models-support
Upgrade offline chat model support. Default to Llama 3.1
2024-08-20 11:51:20 -05:00
Debanjum Singh Solanky
58c8068079 Upgrade default offline chat model to llama 3.1 2024-08-20 09:28:56 -07:00
sabaimran
2d9dd81e76 Re-add authenticated decorator to api_chat.py /chat endpoint 2024-08-19 05:37:18 -05:00
sabaimran
2c5350329a Remove the hashes from titles in found relevant notes 2024-08-18 22:31:15 -05:00
Debanjum Singh Solanky
acdc3f9470 Unwrap any json in md code block, when parsing chat actor responses
This is a more robust way to extract json output requested from
gemma-2 (2B, 9B) models which tend to return json in md codeblocks.

Other models should remain unaffected by this change.

Also removed request to not wrap json in codeblocks from prompts. As
code is doing the unwrapping automatically now, when present
2024-08-16 14:16:29 -05:00
Debanjum Singh Solanky
ca45fce8ac Break long links in train of thought to stay within chat page width 2024-08-16 14:16:29 -05:00
sabaimran
c0316a6b5d Enable free tier users to have unlimited chats with the default chat model (#886)
- Allow free tier users to have unlimited chats with default chat model. It'll only be rate-limited and at the same rate as subscribed users
- In the server chat settings, replace the concept of default/summarizer models with default/advanced chat models. Use the advanced models as a default for subscribed users.
- For each `ChatModelOption' configuration, allow the admin to specify a separate value of `max_tokens' for subscribed users. This allows server admins to configure different max token limits for unsubscribed and subscribed users
- Show error message in web app when hit rate limit or other server errors
2024-08-16 12:14:44 -07:00
Debanjum
8dad9362e7 Improve search model config display for admin (#887) from aam-at/feature/improve_search_model_config_admin
Currently, the search model config display for admins only shows the id of the search model config, which is not very informative. 

The changes enhances the admin console by displaying the name of the search model config (name), as well as the bi-encoder model (bi_encoder) and cross-encoder model (cross_encoder) along the id.
2024-08-16 07:33:55 -07:00
Debanjum
2b1482d2b4 Fix indexing content from Emacs #883 from aam-at/bugfix/fix_emacs_if
Previously `force' was passed as a query param to the single indexing API. After the recent API updates, it is meant to select the API method to use (PATCH vs PATCH). Converting `force' argument to a bool fixes implementing this new behavior
2024-08-16 07:32:46 -07:00
Debanjum
0b568e204e Add model_config for cross-encoder model (#885) from aam-at/feature/crossencoder_model_config
Add `model_config' for the cross-encoder model, so the server admin can use models which require the `trust_remote_code' argument to run locally
2024-08-16 07:32:19 -07:00
Debanjum
39e566ba91 Improve Document, Online Search to Answer Vague or Meta Questions (#870)
- Major
  - Improve doc search actor performance on vague, random or meta questions
  - Pass user's name to document and online search actors prompts

- Minor
  - Fix and improve openai chat actor tests
  - Remove unused max tokns arg to extract qs func of doc search actor
2024-08-16 06:46:13 -07:00
Debanjum Singh Solanky
27ad9b1302 Remove unused max tokns arg to extract qs func of doc search actor 2024-08-13 12:53:39 +05:30
Debanjum Singh Solanky
f75606d7f5 Improve doc search actor performance on vague, random or meta questions
- Issue
  Previously the doc search actor wouldn't extract good search queries
  to run on user's documents for broad, vague questions.

- Fix
  The updated extract questions prompt shows and tells the doc search
  actor on how to deal with such questions

  The doc search actor's temperature was also increased to support more
  creative/random questions. The previous temp of 0 was meant to
  encourage structured json output. But now with json mode, a low temp is
  not necessary to get json output
2024-08-13 12:53:39 +05:30
Debanjum Singh Solanky
3675938df6 Support passing temperature to offline chat model chat actors
- Use temperature of 0 by default for extract questions offline chat
  actor
- Use temperature of 0.2 for send_message_to_model_offline (this is
  the default temperature set by llama.cpp)
2024-08-13 12:53:00 +05:30
Shantanu Sakpal
b5bcce7f85 Cycle through chat history in chat input on Obsidian (#861)
* Add ability to cycle through the chat history in the chat input on Obsidian (similar to terminal history navigation)
* Add mod key shortcut to cycle through chat history in chat input
* Add shortcut help text in chat input placeholder

---------

Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>
2024-08-12 23:55:25 -07:00
srikary12
05c0aa3882 Support exclusion file filters (#826)
### Overview
Support exclude file filter in user search queries

### Details
- All of the exclude file filter terms need to be satisfied
- Any one of the include file filter terms should be satisfied

### Example
- **Search Query**: *what happened yesterday? -file:"tasks.org" -file:"work.md" file:"diary.org" file:"journal.org*
- **Behavior**: Query will try find relevant notes in any of `journal.org` or `diary.org` and not in `tasks.org` and not in `work.md`

### Details
* Add support for exclusion file filters
* Translate file filter to valid Django DB entry filter regex
* Exclude all files when multiple exclude file filter in query

Previously we were applying an "Or" filter, which would exclude any
file mentioned in a query with multiple exclude file filter.

This is not what we naturally mean when we ask excluding a file in a query

* Rename, rearrange, deduplicate and add file filter tests

Closes #728
---------

Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>
2024-08-12 05:41:54 -07:00
Alexander Matyasko
2d9bf14ecb Improve search model config display for admin 2024-08-11 19:13:25 +08:00
Debanjum Singh Solanky
7815e02dd4 Release Khoj version 1.20.4 2024-08-11 16:00:13 +05:30
Debanjum Singh Solanky
d951e36945 Update khoj.el package description, it had gone stale 2024-08-11 15:52:46 +05:30
Debanjum Singh Solanky
16b31c3e35 Refresh automation data shown by edit automation card after update
Previously required the automation page to be refreshed to see updates
to the automation in the edit automation card. This would be seen when
user tries to edit an automation multiple times (without a page refresh)
2024-08-11 15:52:46 +05:30
Debanjum Singh Solanky
f2f37ae444 Fix creating, editing automations that start weekly on Sunday 2024-08-11 15:52:46 +05:30
Debanjum Singh Solanky
ec9add9a51 Fix automation edit cards height. Scroll when card longer than screen 2024-08-11 15:52:46 +05:30
sabaimran
d99f03e4f3 If the list of choices in a chunk is empty, continue in openai response 2024-08-11 15:30:09 +05:30
Alexander Matyasko
f16b0f628b Fix true/false evaluation in Emacs to prevent unintended index re-indexing
Previously, the code incorrectly treated all non-nil values as true, leading to
the index being re-indexed with the force flag whenever the user selected to
update the index.
2024-08-11 17:24:11 +08:00
Alexander Matyasko
0e9e9648e6 Fix emacs if syntax 2024-08-11 17:24:11 +08:00
sabaimran
6f94a076f7 Add conversation_id parameter to the create_automation method 2024-08-11 10:45:13 +05:30
sabaimran
acb825f4f5 Bug fixes for automations
- Pass the new conversation id as kwarg for the scheduled_chat function
- For edit automations, re-use the original conversation id
- Parse images correctly for image automations
2024-08-11 10:41:43 +05:30
Debanjum Singh Solanky
5075d13902 Give visual feedback when interact with chat message feedback buttons
- Use color to provide visual feedback when hover, click on feedback
  buttons
- Use color to provide visual feedback when hover on speech, copy
  buttons click
- Add cooldown period before being able to send feedback on that message again.
  Avoids inadvertent multiple consecutive clicks on feedback buttons
2024-08-10 20:09:52 +05:30
Debanjum Singh Solanky
b3c6c8c84b Add OpenGraph metadata to web app pages for improve social share links 2024-08-10 18:14:05 +05:30
Debanjum Singh Solanky
fc411091c8 Add apple favicon, load favicons for each web app page from assets folder 2024-08-10 18:14:05 +05:30
Debanjum Singh Solanky
a7623e64fa Move Khoj webmanifest, assets to new web app public directory 2024-08-10 18:14:04 +05:30
sabaimran
af1d4b9ba4 Remove the premium requirement for speech for now 2024-08-10 14:10:12 +05:30
sabaimran
1d581464e6 Filter out any undefined agents when rendering the home page 2024-08-10 13:33:55 +05:30
sabaimran
acf1c14122 Release Khoj version 1.20.3 2024-08-09 18:11:11 +05:30
sabaimran
7d3a25f8c0 Handle processing case for the schedule leader process lock when it's empty 2024-08-09 16:37:06 +05:30
sabaimran
faf3584acd Fix automations edit button 2024-08-09 14:21:11 +05:30
sabaimran
5ef198a5b2 Improve default background color styling for inputs 2024-08-08 18:08:05 +05:30
sabaimran
c08b9e89f0 Update test_db_lock with new function name 2024-08-08 13:03:01 +05:30
sabaimran
64b2073e63 In the time-based job for managing the schedule leader, and logic to create a new lock when the current one is expired. 2024-08-08 12:42:59 +05:30
sabaimran
7ee0d9067d Fix apostrophe issue in copy text when commandempty in settings page 2024-08-08 11:41:10 +05:30
sabaimran
f28693c8c7 create a useismobilewidth method for standardized mobile view detection. 2024-08-07 21:04:44 +05:30
sabaimran
2943bed5d4 Update category colors 2024-08-07 18:51:31 +05:30
sabaimran
37afa3411f Improve the file upload experience in the settings page 2024-08-07 18:51:20 +05:30
sabaimran
1ee21f5150 Add support for showing files outside of conversation view and linking people to manage files in settings 2024-08-07 18:50:53 +05:30
sabaimran
93f4ceabc1 Add drag/drop file upload support to the chat input area 2024-08-07 18:50:19 +05:30
sabaimran
370ebdee24 Standardized the mobile width calculation 2024-08-07 18:49:06 +05:30
sabaimran
52fed6023f Overlay the side panel on top of other content 2024-08-07 18:46:06 +05:30
Alexander Matyasko
823f8d58bb Add model_config for crossencoder model
Add model_config for crossencoder model, so the user can use models
which require trust_remote_code.
2024-08-07 18:00:12 +08:00
sabaimran
09b71846be Remove favicon.ico as it's interfering with favicon rendering in the home page 2024-08-07 11:53:25 +05:30
Debanjum Singh Solanky
167ef000f4 Fix chat API for non-streaming mode json response 2024-08-06 19:27:54 +05:30
sabaimran
00ee4c2697 Release Khoj version 1.20.2 2024-08-06 16:16:33 +05:30
sabaimran
d4a8ff0683 Support workflow dispatch events for running the pypi.yml job 2024-08-06 15:55:39 +05:30
sabaimran
ccccb8e7e6 Just ignore the static directory outputting by django's static collection 2024-08-06 15:51:54 +05:30
sabaimran
c4be3b43e5 Add the compiled folder to the list of directories to look through for static templates 2024-08-06 14:50:44 +05:30
sabaimran
265d2a79be Remove duplicate assets from being included in the pypi output 2024-08-06 13:51:37 +05:30
sabaimran
24d0fdb262 Fix directory referenceds in pypi.yml configuration for compiled folder 2024-08-06 13:38:34 +05:30
sabaimran
23b1b36f8c Fix directory referenceds in pypi.yml configuration for compiled folder 2024-08-06 13:31:42 +05:30
sabaimran
81c75e1024 Fix static file folder path for the pypi build
- Since the .gitignore will ignore any of the assets in the src/ folder when building the package wheel, we need to output the static assets to another folder just for the python pypi package. Use /compiled for this.
2024-08-06 13:24:26 +05:30
sabaimran
694f551625 Fix mkdir step when copying generated files 2024-08-06 10:17:56 +05:30
sabaimran
7607abc726 Release Khoj version 1.20.1 2024-08-06 10:05:41 +05:30
sabaimran
e9f9d92989 Try to manually copy the built files into where the src directory should be for the pypi build 2024-08-06 10:05:06 +05:30
Debanjum
c23688e2de Fixes and Improvements Post Spring UX Release (#880)
- Auto focus on email input on login screen for smoother login experience
- Use file icon associated with search page results. Improve search bar
- Show logged in user's email in nav menu for context
- Use previous icons with eyes for search, agents and automations items in nav menu
2024-08-05 14:32:31 -07:00
Debanjum Singh Solanky
a4388c5e65 Use custom Khoj Icons for Search, Agents & Automation in Nav Menu
- Update agents, automations, search svg icons
2024-08-06 02:55:29 +05:30
sabaimran
e9d6899fc2 Change the way the export is created for the pypi package in order to transfer static files out of the tmp shell 2024-08-05 22:46:54 +05:30
sabaimran
b17577c138 Fix configuration for default voice model settings 2024-08-05 19:57:21 +05:30
Debanjum Singh Solanky
ec106d743d Use file icon associated with search page results. Improve search bar 2024-08-05 19:24:39 +05:30
Debanjum Singh Solanky
4258392fc7 Auto focus on email input on login screen for smoother login experience 2024-08-05 19:24:16 +05:30
Debanjum Singh Solanky
020a956c89 Show user email address on settings menu for logged in account context 2024-08-05 19:19:47 +05:30
sabaimran
998d08f155 Fix logic for deletion to automatically re-render the side pane 2024-08-05 18:07:20 +05:30
sabaimran
20d95dc45e Add the favicon.ico file to the public directory of app.khoj.dev 2024-08-05 18:04:03 +05:30
sabaimran
1eab6c8590 Add additional icons for agents, pencil line and chalkboard 2024-08-05 17:23:29 +05:30
sabaimran
bafda233e2 Add standlone khoj_domain for allowed_hosts 2024-08-05 17:11:37 +05:30
Debanjum Singh Solanky
e412ed3bcb Release Khoj version 1.20.0 2024-08-05 16:25:21 +05:30
Debanjum Singh Solanky
9f785dbafe Format web app package.json using prettier 2024-08-05 16:23:31 +05:30
Debanjum Singh Solanky
7d3a208f8b Update bump version script to bump new next.js web app version too 2024-08-05 16:20:47 +05:30
sabaimran
2a63439b16 Merge pull request #879 from khoj-ai/features/migrate-to-spring-ui
Migrate all existing pages except login to the new spring ui
2024-08-05 03:45:02 -07:00
sabaimran
b7ed32f455 Merge branch 'master' of github.com:khoj-ai/khoj into features/migrate-to-spring-ui 2024-08-05 16:12:46 +05:30
sabaimran
7e6b611a19 Fix typo for Obsidian 2024-08-05 15:55:06 +05:30
sabaimran
34d54c75f7 Lint new changes again 2024-08-05 15:54:50 +05:30
Debanjum Singh Solanky
7cb14ff07a Add dev setup script. Run prettier on web app pre-commit 2024-08-05 15:49:31 +05:30
sabaimran
91047d1619 Use a png for the windows desktop icon 2024-08-05 15:29:30 +05:30
sabaimran
1151d14466 Add a separate windows object in the todesktop configuration 2024-08-05 14:27:56 +05:30
sabaimran
c56072aa7b Update todesktop runtime and use the icns file for the todesktop configuration 2024-08-05 14:19:38 +05:30
sabaimran
484b0aa96b Use the newer, simpler favicon across desktop and documentation. Update the macos icon set 2024-08-05 14:06:04 +05:30
sabaimran
1b35a3b16e Fix link to login in the nav menu 2024-08-05 12:32:19 +05:30
sabaimran
5a5bbe3852 Remove deprecate views, assets 2024-08-05 12:31:47 +05:30
sabaimran
c61b289bd1 Migrate all existing pages except login to the new spring ui 2024-08-05 12:17:56 +05:30
sabaimran
f835e330b8 Fix selection of icons, colors, add examples for personal finance 2024-08-05 12:08:18 +05:30
sabaimran
af6a70c9fb Fix fuschia spelling in the colorutils file as well 2024-08-05 11:51:45 +05:30
sabaimran
e0775446c9 fix spelling of fuschia :( 2024-08-05 11:50:11 +05:30
sabaimran
de1cd8c264 Clean up some of the suggestions code, improve randomness of cards' 2024-08-05 11:19:50 +05:30
sabaimran
37e261ff93 Show connected icon when files or notion is indexed 2024-08-05 10:33:18 +05:30
sabaimran
8bc28fb11d Merge pull request #878 from khoj-ai/features/big-upgrade-chat-ux
Spring UI: Modernize UX for normie development
2024-08-04 21:32:18 -07:00
sabaimran
22cfedcaff In the chat history side panel, order conversations by updated time 2024-08-05 09:48:00 +05:30
sabaimran
8220dc6115 Include the updated_at datetime when returning a conversation session 2024-08-05 09:47:13 +05:30
Debanjum Singh Solanky
e296d387e1 Clean duplicate title shown in reference snippets of hierarchical docs
Hierarchical documents like org-mode, markdown have their ancestry
shown in first line. Remove it to show cleaner, deduplicated reference
text from org-mode, markdown files
2024-08-05 04:59:06 +05:30
Debanjum Singh Solanky
95c2a52775 Show file icons in references for first party supported document types
Add org, markdown, pdf, word, icon and default file icons to simplify
identifying file type used as reference for generating chat response
2024-08-05 04:59:06 +05:30
Debanjum Singh Solanky
18a973b666 Fix name of Khoj logo component file in web app 2024-08-05 04:59:06 +05:30
Debanjum Singh Solanky
842036688d Format next.js web app with prettier 2024-08-05 04:59:06 +05:30
Debanjum Singh Solanky
41bdd6d6d9 Throw warning on prettier formatting issues in web app 2024-08-05 03:58:20 +05:30
Debanjum Singh Solanky
1cdfa8087c Update Khoj tagline to "Your Second Brain" 2024-08-05 02:27:05 +05:30
Debanjum Singh Solanky
46f928165c Fix deep linking to settings page cards from docs 2024-08-05 02:27:05 +05:30
sabaimran
f7840782a4 Fix broken rendering of math equations via katex 2024-08-05 00:20:43 +05:30
Debanjum Singh Solanky
b803ed19d3 Add simplified, cleaner khoj logo images to web app static dir 2024-08-04 23:40:21 +05:30
sabaimran
69c3635ce7 Merge pull request #877 from khoj-ai/features/fit-and-finish-new-ux
Fit and finish updates for the new UX
2024-08-04 10:26:33 -07:00
Debanjum Singh Solanky
51e56e17ee Align padding of agent pills to home screen chat input on small screens 2024-08-04 21:57:54 +05:30
Debanjum Singh Solanky
b744dffefd Align voice message button with send chat message button style 2024-08-04 21:04:38 +05:30
Debanjum Singh Solanky
70f670dcf7 Show send button when text in chat input else voice message button
Utilize chat footer space more efficiently. This is especially useful
on small screens

- Send button is anyway only enabled when there is text in chat input
- Otherwise voice message button is better to show by default
2024-08-04 19:25:49 +05:30
Debanjum Singh Solanky
c627527a6f Reorder automation card actions buttons. Put Delete action last 2024-08-04 19:01:11 +05:30
Debanjum Singh Solanky
c7b67a978e Align agents and automation page structure, widths and spacings
- Remove invalid call to styles.main
- Remove unnecessary top padding above side pane to keep side pane at
  consistent position across web app
- Use same pageLayout styles and styling structure on agent like
  automation
- Vertically center automation section and page title on it's row
- Fix applying flex vs grid with tailwind
2024-08-04 19:01:11 +05:30
Debanjum Singh Solanky
60af173c4a Improve responsive spacing of chat page footer buttons
- Remove x axis footer padding on small screens to preserve space,
  keep equal spacing between footer items
- Add 1rem margin to buttons to not have overlap in boundary
- Add 1rem y-axis padding to chat footer to not have focus boundary
  leave the footer boundary on smaller screens
2024-08-04 19:01:10 +05:30
sabaimran
4f2fcc82f0 Make the input area only rounded on the top corners when in mobile view
- Create better styling for the input area buttons, resizing in mobile and creating more even height with a more minimal send button
2024-08-04 18:28:33 +05:30
sabaimran
322fb34d4b Add top padding to the automations header to align it with the agents page 2024-08-04 12:27:37 +05:30
sabaimran
3e1e4a1857 Move the clients section back to the bottom 2024-08-04 11:32:22 +05:30
Debanjum Singh Solanky
caf5c3d74c Link to Khoj manifest in home page metadata to support PWA install
Installing Khoj as PWA was supported in previous web UX as well. This
just adds link to the existing webmanifest to continue support for
installing Khoj as PWA with new web UX
2024-08-04 05:06:38 +05:30
Debanjum Singh Solanky
692058bbdd Fix time of day calculation logic
Previously between 00:00 - 04:00 it'd trigger afternoon insteead of
evening
2024-08-04 04:53:50 +05:30
Debanjum Singh Solanky
015c155582 Simplify structure of chat page to match other pages 2024-08-04 04:43:55 +05:30
Debanjum Singh Solanky
bf71e472c4 Load static assets from Khoj server in dev environment 2024-08-04 04:25:48 +05:30
Debanjum Singh Solanky
f38c072f07 Update chat session title in side pane to new title after rename
Previously the rename wasn't updating the chat session title. We'd
have to refresh the page or side pane to get latest chat session names
after rename action.
2024-08-04 04:25:48 +05:30
Debanjum Singh Solanky
2f7a8698a0 Fix width and equalize spacing between buttons in chat footer
Previously the footer's right border wasn't visible on small screens
due to usage of w-full

Use mr-1 on send button instead of px-1 on chat input parent to
eualize chat footer buttons spacing
2024-08-04 04:25:48 +05:30
Debanjum Singh Solanky
5541bc09c8 Prefix Khoj page breadcrumbs to chat page title for orientation
Allows tab search by looking at standard prefix. Still allows
page title based identification of different Khoj chat sessions
2024-08-04 04:25:48 +05:30
Debanjum Singh Solanky
6a9865ace7 Only show API keys card in non anon mode
- Show informative toast messages on copy, delete of API keys
- Onle show API keys card in non anonymous mode. API keys aren't
  required (and is disabled on server side) in anon mode. Not showing
  card at all in anon mode reduces chance of unnecessary confusion
2024-08-04 04:25:45 +05:30
Debanjum Singh Solanky
f28208d35b Only show chat sessions uptil last month in side pane
- Reduce chat title size
2024-08-04 01:52:08 +05:30
sabaimran
75559a55aa only show search if logged in. update agents icon 2024-08-03 23:23:03 +05:30
sabaimran
185dcb61f7 Update the settings page to better match the design 2024-08-03 20:49:19 +05:30
sabaimran
3e74d383fe Strip quotes from the response mode llm response 2024-08-03 17:33:20 +05:30
sabaimran
87e97e40f4 Resolve various warnings during export 2024-08-03 17:33:04 +05:30
sabaimran
5a75f2c00f Use filled icons when side panel is open 2024-08-03 15:42:49 +05:30
sabaimran
e6260a7bb6 Improve oadding for h9me page chat iput area and inc margin on api keys 2024-08-03 15:33:33 +05:30
Debanjum Singh Solanky
7a8a9fc807 Auto focus cursor on search input when open search page 2024-08-03 13:52:36 +05:30
Debanjum Singh Solanky
30304ccc56 Fix session drawer to fit title, action triple-dot in width on mobile 2024-08-03 13:52:36 +05:30
Debanjum Singh Solanky
5b17fa5dda Set home, chat page height so footer, header visible w/o scroll on phone
Set dynamic view height of page to 100%
2024-08-03 13:52:36 +05:30
sabaimran
687a881ad2 Remove the agents header in the loading view 2024-08-03 13:44:56 +05:30
sabaimran
0db630a123 image cards should be /image, not /paint 2024-08-03 13:44:31 +05:30
sabaimran
261f62e353 Fix automations mobile view by using a wrapper component that chooses a dialog or a drawer 2024-08-03 13:44:17 +05:30
sabaimran
4ce17acd00 Set greeting message to longer text in default view. Only show two agents in mobile 2024-08-03 12:14:58 +05:30
sabaimran
6c35ee4960 Revert height of the side panel on the home page 2024-08-03 11:59:07 +05:30
Debanjum Singh Solanky
e66adf60c5 Have the home and chat page take full height, reduce greeting top space 2024-08-03 11:54:12 +05:30
Debanjum Singh Solanky
cf8745ef78 Improve structure of chat footer on mobile to put agents above input 2024-08-03 11:31:57 +05:30
Debanjum Singh Solanky
529ffdb7e3 Make Title, Chat Footer Icons larger to ease click, tap on Mobile 2024-08-03 11:23:29 +05:30
Debanjum Singh Solanky
8d1c5226ec Remove unnecessary debug logs 2024-08-03 09:55:31 +05:30
sabaimran
f136214290 Improve the nav menu in the not logged in experience 2024-08-03 09:44:04 +05:30
sabaimran
f9606ce9b7 Merge branch 'features/fit-and-finish-new-ux' of github.com:khoj-ai/khoj into features/fit-and-finish-new-ux 2024-08-03 09:34:04 +05:30
Debanjum Singh Solanky
d8fe677933 Prevent overflow on Search page by search results 2024-08-03 07:07:35 +05:30
Debanjum Singh Solanky
f3765a20b9 Improve content alignment on automation page for small screens
- Left align email, location, timezone pills on small screens
- Indent user enabled automations to improve delineation between
  sections
2024-08-03 07:05:15 +05:30
Debanjum Singh Solanky
a6e1b2c7cb Style nav menu button and expand nav menu item click area to full-width
Style profile pircture button on nav menu
 - Use primary colored ring around subscribed user profile on nav menu
 - Use gray colored ring around non-subscribed user profile on nav menu
 - Use upper case initial as profile pic for user with no profile pic

- Click anywhere on nav menu item to trigger action
  Previously the actual clickable area was smaller than the width of
  the nav menu item
2024-08-03 05:43:24 +05:30
Debanjum Singh Solanky
eed9e401a2 Improve alignment of title bar elements 2024-08-03 04:11:58 +05:30
sabaimran
f188396395 Prompt to login when authenticated, click on suggestion card
- Improve styling for the side panel when not logged in
2024-08-03 01:42:32 +05:30
sabaimran
9c5ff1699a Use new nav menu alignment in the settings page 2024-08-03 01:42:32 +05:30
sabaimran
b1d3979ed9 Fix navmenu in settings, share/chat pages 2024-08-03 01:42:21 +05:30
sabaimran
5f8b76c8f2 Fix layout/styling of the factchecker app 2024-08-03 01:07:59 +05:30
sabaimran
1bb746aaed Adjust spacing when side panel is opened 2024-08-03 01:07:59 +05:30
sabaimran
07b3bdf181 Update nav menu styling to include everything in one header
- Move the nav menu into the chat history side panel component, so that they both show up on one line
- Update all pages to use it with the new formatting
- in mobile, present the sidebar button, home button, and profile button evenly centered in the middle
2024-08-03 01:07:55 +05:30
Debanjum Singh Solanky
e62888659f Only show greeting once userConfig is fetched from server
- Pass userConfig from Home as prop to chatBodyData component with
  loading state
- Pass loading state of userConfig to allow components to handle
  rendering dependent elements once it is loaded
2024-08-02 20:25:09 +05:30
Debanjum Singh Solanky
0adee07d40 Update home page greetings to use user name, when available 2024-08-02 20:25:09 +05:30
sabaimran
bbe7491f2f Prompt to login when authenticated, click on suggestion card
- Improve styling for the side panel when not logged in
2024-08-02 20:12:18 +05:30
sabaimran
d48a789442 Use new nav menu alignment in the settings page 2024-08-02 19:44:30 +05:30
sabaimran
e6014e89bf Fix navmenu in settings page 2024-08-02 19:28:59 +05:30
sabaimran
1509c536f9 Fix layout/styling of the factchecker app 2024-08-02 19:06:01 +05:30
sabaimran
0d8cdee60a Adjust spacing when side panel is opened 2024-08-02 17:49:50 +05:30
sabaimran
d3c07a098d Update nav menu styling to include everything in one header
- Move the nav menu into the chat history side panel component, so that they both show up on one line
- Update all pages to use it with the new formatting
- in mobile, present the sidebar button, home button, and profile button evenly centered in the middle
2024-08-02 17:46:13 +05:30
sabaimran
5a8ea884a9 Use new HTTP stream format within the new UX
Use updated format for HTTP streamed responses from the Khoj server in the new chat UX
Remove references to the websocket connected field, as websocket use has been deprecated
2024-08-02 02:35:10 -07:00
Debanjum Singh Solanky
02b46a1784 Render references after chat response is streamed for smoother render
Otherwise the Khoj's chat response is filling up in between the
streamed message and already rendered references section at the bottom
of the message

Define OnlineContext type to simplify typing online context param
across other interfaces and functions
2024-08-02 14:11:34 +05:30
Debanjum Singh Solanky
a733e5c1d4 Remove unused handleCompiledReferences chat functions 2024-08-02 13:18:55 +05:30
Debanjum Singh Solanky
7858aff2e2 Trigger welcomeConsole only once on chat, shared chat page load 2024-08-02 13:18:01 +05:30
Debanjum Singh Solanky
cab0957fd3 Just show Khoj logo on title bar on small screens
Continue to show logo + text on larger screens
2024-08-02 13:18:01 +05:30
Debanjum Singh Solanky
3f607b3978 Add icons, improve description of home, chat & search page metadata 2024-08-02 13:18:01 +05:30
Debanjum Singh Solanky
4f783b911c Update DOMPurify imports correctly to resolve compilation warnings 2024-08-02 13:18:01 +05:30
sabaimran
4492017b96 Move processmessagechunk file into a common chat function 2024-08-02 12:31:43 +05:30
sabaimran
13dee7d89e Remove status update for understanding query 2024-08-01 19:22:21 +05:30
sabaimran
6babd5c0ce Merge pull request #876 from khoj-ai/features/use-intl-phone-input-settings
Use international phone number input and verify whatsapp flow
2024-08-01 04:52:02 -07:00
sabaimran
1b2cad2a2c Use af in the default state and configure the phone number input styling 2024-08-01 17:04:57 +05:30
sabaimran
723b37955a Disable input for phone number only if its pending verification 2024-08-01 16:45:38 +05:30
sabaimran
84dd1b57fe Use an intl phone input number field and fix the whole verification flow
- There were some state mismatches in configuring a whatsapp number. This commit fixes those issues and uses an external library for phone number validation
2024-08-01 16:44:17 +05:30
sabaimran
ed16914ac3 Remove deprecated fields and fix erroneous export in settings page 2024-08-01 14:45:54 +05:30
sabaimran
7941f4d54d Remove references to deprecated setupwebsocket function 2024-08-01 14:43:17 +05:30
sabaimran
db93ac5d4b Merge branch 'features/big-upgrade-chat-ux' of github.com:khoj-ai/khoj into features/use-new-sse-in-new-chat-ux 2024-08-01 14:41:50 +05:30
sabaimran
fd0e0405af Fix logic for setting and sending the initial chat message from the home page
- Load agents only once when the page loads, rather than triggering constant re-renders
2024-08-01 13:53:16 +05:30
sabaimran
9a43622cef Remove usages of the websocketconnected variable 2024-08-01 13:14:23 +05:30
sabaimran
bfeb64b48f Migrate the shared chat page to also use the new SSE streaming format 2024-08-01 13:14:09 +05:30
sabaimran
833553c3a3 Move conversation commands selection earlier to include in telemetry collected 2024-08-01 12:52:41 +05:30
sabaimran
dbbcf2564f Remove the usage of emojis in the incremental status updates 2024-08-01 12:52:05 +05:30
sabaimran
cd85a51980 Ingest new format for server sent events within the HTTP streamed response
- Note that the SSR for next doesn't support rendering on the client-side, so it'll only update it one big chunk
- Fix unique key error in the chatmessage history for incoming messages
- Remove websocket value usage in the chat history side panel
- Remove other websocket code from the chat page
2024-08-01 12:50:43 +05:30
Debanjum
60870a7a3e Create Settings Page in new Web App (#872)
- Details
  - Add Profile Client, Content Sections
  - Make Multi Step Cards for Whatsapp, Files, Notion Integrations
  - Align Settings page with new Baraabar UX
2024-07-30 06:59:42 -07:00
Debanjum Singh Solanky
32ce564b7c Remove unused Files Connect button and setup Github content card 2024-07-30 18:55:14 +05:30
Debanjum Singh Solanky
ecb873c488 Only allow search model to be updated without being subscribed
Do not make fetch request to server if user is not subscribed
2024-07-30 18:50:57 +05:30
Debanjum Singh Solanky
f58cff5bcc Increase rate limit in the api/content vs deprecated indexer API 2024-07-30 16:09:26 +05:30
Debanjum Singh Solanky
f0bb6883f8 Improve Delete experience on Files Card in Settings Page
Improve placeholder text for notion API key and Whatsapp
number (mention country code required)
2024-07-30 15:25:14 +05:30
sabaimran
b1eb564706 remove the optional pydantic typing from the files param 2024-07-30 15:25:14 +05:30
sabaimran
4a7efdc552 Use patch in place of put in the indexer API call, ensure that files are not being required in the indexer path 2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
ffbf57292c Create synced files management modal on the settings page
Use a Command Dialog to allow easier filtering of files to view
without having to leave the settings page
2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
ccc46a09b5 Add new API to batch delete a list of files by filename
- Rearrange DELETE content API definitions order to go from more
  specific to more general
- Create batched file deletion DB adapter
2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
9d86cb57ac Build UX to Connect and Manage Notion Integration via Settings Page 2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
7ee179ee1f Return user's Notion token in API call for detailed user settings 2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
00a908ae12 Move subscription card to Profile settings section. Remove Billing section
- Why
  Profile section and billing section looked too empty (1 card each).
  Combining them makes the setting page look more complete. Shows
  subscription options early on
- Details
  - Made Futurist text orange
  - Made Unsubscribe a down button instead of cloud slash
  - Updated toast title to subscription
  - Improve Futurist trial title and description
2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
058c902dc7 Delete unused npm package-lock.json as Web app uses yarn.lock instead 2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
b8c9b3ffa3 Reduce padding height of input area on new home page 2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
8a447107dd Set user name on clicking Save button on settings page 2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
44e0b20202 Align Content, Client & Billing settings sections with new designs 2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
51e83bcc26 Improve responsive behavior of settings cards 2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
efcad4996d Add phone number verification for Whatsapp to new settings page 2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
48548684c0 Add card to connect Whatsapp to Khoj on settings page 2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
8ec90f194f Add title icons for each content section card on settings page 2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
60cdf61737 Create billing section for managing subscription on settings page
- Replicate behavior on current settings.html page
- Improve text for each subscription state to make it more informative, fun
2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
2e165a0e0a Create client API keys section on settings page
- Add table shadcn component to use in API keys settings section
- In dev mode, route requests to auth to khoj server at localhost:42110
2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
00fa4fa0fa Save model on selecting model in dropdown. No extra save action reqd
- Remove now unnecessary button to Save in Card with dropdown
- Use toast to show success, failure (not working)
- Rename language to search, Move it to features section. Add icon to
  the card
2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
13292fc4ca Add icons to card headings 2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
a5a06da3fc Use Dropdown component for model options. Make cards more responsive
- Ensure model name doesn't stretch or shrink dropdown width from
  parent card width
- Ensure buttons flex wrap on smaller displays
2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
ade2f6f5d1 Rename selected voice model in get config API response for consistency
- Update references in new and old web client settings
- Arrange new client settings props and add header comments similar to
- config response for code readability
2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
b3253562a5 Dynamically set Content cards buttons based on already setup or not 2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
7e8e80f29e Create config page using tailwind, shadcn components, styling
- Include side pane but with only the account info in it
- Replicate styling of the old config page
2024-07-30 15:25:14 +05:30
Debanjum Singh Solanky
88007d7552 Get user config in the new web client from the new user config APIs 2024-07-30 15:25:14 +05:30
sabaimran
a6339bb973 Add mroe card suggestions and simplify color selection for suggestion cards 2024-07-29 19:11:39 +05:30
sabaimran
551630f0f1 Code clean-up and some fit and finish
- Add a lot more suggestions cards, improve mobile rendering of suggestion cards, improve alignment of chat input, shift message when starts recording voice, remove dead code
2024-07-28 15:19:36 +05:30
sabaimran
413255ddc7 Add closing tag to whatsapp qr code image 2024-07-28 13:50:38 +05:30
sabaimran
41eb85c933 Update the docs for whatsapp to include the QR code 2024-07-28 13:43:50 +05:30
sabaimran
1a1d9c7257 Merge branch 'master' of github.com:khoj-ai/khoj into features/big-upgrade-chat-ux 2024-07-27 14:18:05 +05:30
Raghav Tirumale
1685c60e3c Nav Menu Upgrades and Minor UX Improvements (#869)
* Converted navigation menu into a dropdown menu
* Moved collapsed side panel menu icons into top row
* Auto refresh when conversation is deleted to update side panel and route back to main page if deletion is on current conversation
* Highlight the current conversation in the side panel
* Dynamic homepage messages with current day and time of day.
* `colorutils` upgraded to have more expansive tailwind color options and dynamic class name generation.
* Converted create agent button alert into shadcn `ToolTip`
* Colored lines and icons for agents in chat window
* Cleaned up border styling in dark mode
* fixed three dot menu in side panel to be more easier to click
* Add the KhojLogo import in the nav menu and use a default user profile icon when not authenticated
* Get rid of custom --box-shadow CSS variable
* Pass the agent metadat through the chat body data in order to style the send button
* Add login to the unauthenticated login view, redirecto to home if conversation history not loaded
* Set a max height for the input text area
* Simplify tailwind class names

---------

Co-authored-by: sabaimran <narmiabas@gmail.com>
2024-07-27 14:12:00 +05:30
Debanjum
8503d7a07b Split Configure API into Content, Model API paths (#857)
## Major: Breaking Changes
- Move API endpoints under /configure/<type>/model to /api/model/<type>
- Move API endpoints under /api/configure/content/ to /api/content/
- Accept file deletion requests by clients during sync
- Split /api/v1/index/update into /api/content PUT, PATCH API endpoints

## Minor: Create New API Endpoint
- Create API endpoints to get user content configurations

Related: #852
2024-07-26 23:48:41 -07:00
Debanjum Singh Solanky
878cc023a0 Fix and improve openai chat actor tests
- Use new form of passing doc references to now passing chat actor
  test
- Fix message list generation from conversation logs provided
  Strangely the parent conversation_log gets passed down to
  message_to_log func when the kwarg is not explicitly specified
2024-07-26 23:53:47 +05:30
Debanjum Singh Solanky
a47a54f207 Pass user name to document and online search actors prompts
This should improve the quality of personal information extraction
from document and online sources. The user name is only used when it
is set
2024-07-26 23:53:17 +05:30
sabaimran
e86143dbb0 Merge pull request #867 from khoj-ai/features/search-page-v2
Update the search page
2024-07-26 08:08:04 -07:00
sabaimran
eb5af38f33 Release Khoj version 1.17.0 2024-07-26 20:14:45 +05:30
Raghav Tirumale
5dcac18ba5 New Agents Page User Interface (#866)
Changes for new agents page
- Modernized agent cards
- Responsive design to support mobile users
- Button for users to create their own agents (coming soon)
- Optimized to use tailwind and icon utils
- Side panel added for quick access to conversations
2024-07-26 20:12:31 +05:30
Debanjum Singh Solanky
3daef910c0 Remove stale comment from api content 2024-07-26 20:05:35 +05:30
sabaimran
44d34f9090 Update the unit test for the subscribed user 2024-07-26 19:59:01 +05:30
sabaimran
377f7668c5 Merge pull request #858 from khoj-ai/use-sse-instead-of-websocket
Use Single HTTP API for Robust, Generalizable Chat Streaming
2024-07-26 07:11:54 -07:00
sabaimran
6607e666dc Increase rate limit for data upload packet size in indexer.py 2024-07-26 19:35:32 +05:30
Debanjum Singh Solanky
778c571288 Use enum to track chat stream event types in chat api router 2024-07-26 00:19:43 +05:30
sabaimran
7482797605 Add some better default states for no files found, prompt to search. Add link to search in the file search compnoent in side panel 2024-07-25 13:00:28 +05:30
sabaimran
662dffea3b Press enter to search 2024-07-24 19:28:38 +05:30
sabaimran
19cd607c96 Style the see content button correctly 2024-07-24 18:28:23 +05:30
sabaimran
75a370cc06 Implement focus mode to click into full text of the note 2024-07-24 18:00:33 +05:30
sabaimran
5adbfe14ab Add a search page that just renders truncated results when you click search 2024-07-24 17:43:19 +05:30
sabaimran
52db15706d Remove unused styling 2024-07-24 17:42:36 +05:30
sabaimran
cfe7a1068e Update the navmenu title if prop is updated and undefined 2024-07-24 17:41:31 +05:30
Debanjum Singh Solanky
ebe92ef16d Do not send references twice in streamed image response
Remove unused image content to reduce response payload size.
References are collated, sent separately
2024-07-24 17:18:14 +05:30
Debanjum Singh Solanky
37b8fc5577 Extract events even when http chunk contains partial or mutiple events
Previous logic was more brittle to break with simple unbalanced
'{' or '}' string present in the event data. This method of trying to
identify valid json obj was fairly brittle. It only allowed json
objects or processed event as raw strings.

Now we buffer chunk until we see our unicode magic delimiter and only
then process it.

This is much less likely to break based on event data and the
delimiter is more tunable if we want to reduce rendering breakage
likelihood further
2024-07-24 17:17:39 +05:30
sabaimran
4d30e5b158 Fix indexing error for notion, expecting image and docx in dict 2024-07-24 16:58:31 +05:30
sabaimran
694bedc25b Add support for text to speech and speech to text (#863)
- Add support for text to speech, speech to text. Add loading and responsive indicators to reflect state.
- When streaming for speech to text, show incremental transcription in the message input field
- When streaming text to speech, and a pause button in the chat message to allow user to stop playback
2024-07-24 14:36:40 +05:30
Raghav Tirumale
3e4325edab Upgrade: New Home Screen for Khoj (#860)
* V1 of the new automations page
Implemented:
- Shareable
- Editable
- Suggested Cards
- Create new cards
- added side panel new conversation button
- Implement mobile-friendly view for homepage
- Fix issue of new conversations being created when selected agent is changed
- Improve center of the homepage experience
- Fix showing agent during first chat experience
- dark mode gradient updates

---------

Co-authored-by: sabaimran <narmiabas@gmail.com>
2024-07-24 13:16:19 +05:30
Debanjum Singh Solanky
70201e8db8 Log total, ttft chat response time on start, end llm_response events
- Deduplicate code to collect chat telemetry by relying on
  end_llm_response event
- Log time to first token and total chat response time for latency
  analysis of Khoj as an agent. Not just the latency of the LLM
- Remove duplicate timer in the image generation path
2024-07-23 23:21:12 +05:30
Debanjum Singh Solanky
b36a7833a6 Remove the old mechanism of streaming compiled references
Do not need response generator to stuff compiled references in chat
stream using "### compiled references:" separator.

References are now sent to clients as structured json while streaming
2024-07-23 19:53:51 +05:30
Debanjum Singh Solanky
eb4e12d3c5 s/online_context/onlineContext chat API response field for consistency
This will align the name of the online context field returned by
current chat message and chat history
2024-07-23 19:50:43 +05:30
Debanjum
498fe2458c Support Gemma 2 Model Family for Offline Chat (#855)
## Overview
- Gemma 2 is a new open model family by Google. They've released a 9B, 29B param model. A 2B model is also expected.
- It performs really well on the Chatbot arena and shows good performance when testing within Khoj as well.
- Llama.cpp support for Gemma 2 architecture seems to have stabilized
- If Gemma 2 performs well in further testing, it can be made the default offline chat model for Khoj
  - Once the 2B param model is released, the model size to download can be automatically chosen based on (V)RAM available

## Major
- Support Gemma 2 for Offline Chat
- Improve and fix chat model prompts for better, consistent context

## Minor
- Fix and improve offline chat actor, director tests
- Improve offline chat truncation to consider chat message delimiter tokens
2024-07-23 06:57:02 -07:00
Debanjum Singh Solanky
0277d16daf Share desktop chat streaming utility funcs across chat, shortcut views
Null check menu, menuContainer to avoid errors on Khoj mini
2024-07-23 19:16:33 +05:30
Debanjum Singh Solanky
e439a6ddac Use async/await in web client chat stream instead of promises
Align streaming logic across web, desktop and obsidian clients
2024-07-23 18:17:47 +05:30
Debanjum Singh Solanky
fafc467173 Put loading spinner at bottom of chat message in web client 2024-07-23 18:17:47 +05:30
Debanjum Singh Solanky
fc33162ec6 Use new chat streaming API to show Khoj train of thought in Desktop app
Show loading spinner at end of current message
2024-07-23 18:17:47 +05:30
Debanjum Singh Solanky
c5ad172616 Keep loading animation at message end & reduce lists padding in Obsidian
Previously loading animation would be at top of message. Moving it to
bottom is more intuitve and easier to track.

Remove white-space: pre from list elements. It was adding too much y
axis padding to chat messages (and train of thought)
2024-07-23 17:56:03 +05:30
Debanjum Singh Solanky
54b4203683 Update chat API client tests to mix testing of batch and streaming mode 2024-07-23 17:56:03 +05:30
Debanjum Singh Solanky
3f5f418d0e Use new chat streaming API to show Khoj train of thought in Obsidian client 2024-07-23 17:56:03 +05:30
Debanjum Singh Solanky
8303b09129 Convert snake case to camel case in chat view of obsidian plugin 2024-07-23 15:29:12 +05:30
Debanjum Singh Solanky
b224d7ffad Simplify get_conversation_by_user DB adapter code 2024-07-23 14:51:11 +05:30
Debanjum Singh Solanky
daec439d52 Replace old chat router with new chat router with advanced streaming
- Details
  Only return notes refs, online refs, inferred queries and generated
  response in non-streaming mode. Do not return train of throught and
  other status messages

  Incorporate missing logic from old chat API router into new one.

- Motivation
  So we can halve chat API code by getting rid of the duplicate logic
  for the websocket router

  The deduplicated code:
  - Avoids inadvertant logic drift between the 2 routers
  - Improves dev velocity
2024-07-23 14:51:11 +05:30
Debanjum Singh Solanky
2d4b284218 Simplify streaming chat function in web client 2024-07-23 14:38:55 +05:30
Debanjum Singh Solanky
6b9550238f Simplify advanced streaming chat API, align params with normal chat API 2024-07-22 22:51:24 +05:30
Debanjum Singh Solanky
b8d3e3669a Stream Status Messages via Streaming Response from server to web client
- Overview
Use simpler HTTP Streaming Response to send status messages, alongside
response and references from server to clients via API.

Update web client to use the streamed response to show train of thought,
stream response and render references.

- Motivation
This should allow other Khoj clients to pass auth headers and recieve
Khoj's train of thought messages from server over simple HTTP
streaming API.

It'll also eventually deduplicate chat logic across /websocket and
/chat API endpoints and help maintainability and dev velocity

- Details
  - Pass references as a separate streaming message type for simpler
    parsing. Remove passing "### compiled references" altogether once
    the original /api/chat API is deprecated/merged with the new one
    and clients have been updated to consume the references using this
    new mechanism
  - Save message to conversation even if client disconnects. This is
    done by not breaking out of the async iterator that is sending the
    llm response. As the save conversation is called at the end of the
    iteration
  - Handle parsing chunked json responses as a valid json on client.
    This requires additional logic on client side but makes the client
    more robust to server chunking json response such that each chunk
    isn't itself necessarily a valid json.
2024-07-22 15:41:21 +05:30
Debanjum Singh Solanky
91fe41106e Convert Websocket into Server Side Event (SSE) API endpoint
- Convert functions in SSE API path into async generators using yields
- Validate image generation, online, notes lookup and general paths of
  chat request are handled fine by the web client and server API
2024-07-21 14:20:22 +05:30
sabaimran
9cf52bb7e4 Update automations UX for more consistency (#856)
* Update the automations UI to be a more suitable color distribution based on new designs

* Use accented colors for the metadata, update dark mode colors

* Update form to use icons as well and render more pretty inline form labels
2024-07-21 12:22:23 +05:30
sabaimran
e694c82343 Fix Docker build issues with yarn / next /node (#859)
* Rollback node version being installed from nodesource to node 20
2024-07-19 19:11:29 +05:30
sabaimran
1af9dbb083 Switch node/yarn install steps to use more native installation patterns 2024-07-19 17:10:08 +05:30
sabaimran
6d5ca5a3e1 yarn clean cache before build 2024-07-19 16:06:38 +05:30
sabaimran
7f0d1bd414 Add verbose logs when outputing yarn install steps 2024-07-19 15:48:43 +05:30
sabaimran
7426a4f819 Prefetch related agent when retrieving the conversation for performance improvements 2024-07-19 14:43:30 +05:30
Debanjum Singh Solanky
07f36fa95a Update new web interface with update calls to /content, /model APIs 2024-07-19 12:23:22 +05:30
Debanjum Singh Solanky
f03525f431 Add back /api/configure as /api/settings API endpoint
It had been removed during the /api/configure/content to /api/content
API migration before
2024-07-19 05:40:34 +05:30
Debanjum Singh Solanky
3832ef0236 Move API endpoints under /api/configure/phone/ to /api/phone/
Pull out /api/configure/phone API endpoints into /api/phone for
more concise and sufficiently explanatory API path

Refactor Flow
1. Rename /api/configure/phone -> /api/phone
2024-07-19 05:40:34 +05:30
Debanjum Singh Solanky
1197266912 Move API endpoints under /configure/<type>/model to /api/model/<type>
Now the API to configure all the AI models is under /api/models.
This provides better organization and API hierarchy. The /configure
url segment was redundant.

- Rename POST /api/phone to PATCH /api/phone
- Rename GET /api/configure to GET /api/settings

Refactor Flow
1. Move out POST /user/name to main api.py
2. Rename /api/configure/<type>/model -> /api/model/<type>
3. Rename @api_configure to @api_mode
4. Rename file api_config.py to api_model.py
2024-07-19 05:40:34 +05:30
Debanjum Singh Solanky
469a1cb6a2 Move API endpoints under /api/configure/content/ to /api/content/
Pull out /api/configure/content API endpoints into /api/content to
allow for more logical organization of API path hierarchy

This should make the url more succinct and API request intent more
understandable by using existing HTTP method semantics along with the
path.

The /configure URL path segment was either
- redundant (e.g POST /configure/notion) or
- incorrect (e.g GET /configure/files)

Some example of naming improvements:
- GET /configure/types -> GET /content/types
- GET /configure/files -> GET /content/files
- DELETE /configure/files -> DELETE /content/files

This should also align, merge better the the content indexing API
triggered via PUT, PATCH /content

Refactor Flow
1. Rename /api/configure/types -> /api/content/types
2. Rename /api/configure -> /api
3. Move /api/content to api_content from under api_config
2024-07-19 05:40:34 +05:30
Debanjum Singh Solanky
bba4e0b529 Accept file deletion requests by clients during sync
- Remove unused full_corpus boolean. The full_corpus=False code path
  wasn't being used (accept for in a test)
- The full_corpus=True code path used was ignoring file deletion
  requests sent by clients during sync. Unclear why this was done
- Added unit test to prevent regression and show file deletion by
  clients during sync not ignored now
2024-07-19 04:53:01 +05:30
Debanjum Singh Solanky
5923b6d89e Split /api/v1/index/update into /api/content PUT, PATCH API endpoints
- This utilizes PUT, PATCH HTTP method semantics to remove need for
  the "regenerate" query param and "/update" url suffix
- This should make the url more succinct and API request intent more
  understandable by using existing HTTP method semantics
2024-07-19 01:45:53 +05:30
Debanjum Singh Solanky
e9f86e320b Fix and improve offline chat actor, director tests
- Use updated references schema with compiled key
- Enable director tests that are now expected to pass and that do pass
  (with Gemma 2 at least)
2024-07-18 03:43:09 +05:30
Debanjum Singh Solanky
b0ee78586c Improve offline chat truncation to consider message separator tokens 2024-07-18 03:43:09 +05:30
Debanjum Singh Solanky
6f46e6afc6 Improve and fix chat model prompts for better, consistent context
- Add day of week to system prompt of openai, anthropic, offline chat models
- Pass more context to offline chat system prompt to
  - ask follow-up questions
  - know where to find information about khoj (itself)
- Fix output mode selection prompt. Log error if model does not select
  valid option from list of valid output modes provided
- Use consistent names for question, answers passed to
  extract_questions_offline prompt

- Log which model extracts question, what the offline chat model sees
  as context. Similar to debug log shown for openai models
2024-07-18 03:43:09 +05:30
Debanjum Singh Solanky
53eabe0c06 Support Gemma 2 for Offline Chat
- Pass system message as the first user chat message as Gemma 2
  doesn't support system messages
- Use gemma-2 chat format
- Pass chat model name to generic, extract questions chat actors
  Used to figure out chat template to use for model
  For generic chat actor argument was anyway available but not being
  passed, which is confusing
2024-07-18 03:09:38 +05:30
Debanjum Singh Solanky
65dade4838 Create API endpoints to get user content configurations
This is to be used by the new Next.js web client
2024-07-17 13:41:14 +05:30
Debanjum
2ab8fb78b1 Migrate the PyPI package to use project name: khoj (#853)
### Changes
- Deprecate [khoj-assistant](https://pypi.org/project/khoj-assistant) pypi package. Use more accurate and succinct pypi project name, [khoj](https://pypi.org/project/khoj)
- Update references to use `khoj` pypi package in docs and code
- Update pypi workflow to publish to both khoj, khoj-assistant for now
- Update stale python 3.9 support mentioned in our pyproject
   Can't support python 3.9 as depend on [Django 5.0.7](https://pypi.org/project/Django/5.0.7/) which needs python >=3.10

### Verify
- Updated `pypi.yml` github workflow publishes to both (new) [khoj](https://pypi.org/project/khoj/1.16.1.dev16/), (old) [khoj-assistant](https://pypi.org/project/khoj-assistant/1.16.1.dev16/) pypi projects
- Can install Khoj python package with `pip install khoj`
2024-07-17 01:05:51 -07:00
Debanjum
bf815e4463 Refactor Config API and Settings pages for Reuse and Consistency (#852)
### Major
- Reuse get config data logic across config pages on web client
- Make config api endpoint urls and response fields consistent
- Rename API path /api/config to /api/configure
- Move Web, Desktop client settings page to be under `/settings` from the previous `/config` url path

### Minor
- Pass isMobileWidth prop to SidePanel via chat share interface
- Turn prettier off instead of throwing error for now
- Do no explicitly add line-clamp plugin as it's in Tailwind by default
2024-07-17 01:03:06 -07:00
Debanjum Singh Solanky
a1c362a4f7 Expose web, desktop settings page under /settings, not /configure
- Update references to the settings page to use new url across docs
  and code
- Rename desktop and web settings page to settigns.html instead of
  config[ure].html
2024-07-17 13:17:29 +05:30
Debanjum Singh Solanky
b015b0e83d Arrange config API detailed response fields to improve readability
There are a lot of fields being returned. Group returned fields and
add comment header to each Group for readability
2024-07-17 13:17:28 +05:30
Debanjum Singh Solanky
71ebf31a54 Make config API detailed response fields more intuitive, consistent
- Use name, id for every [search|chat|voice|pain]_model_option
- Rename current_model_state field to more intuitive enabled_content_source
- Update references to the update fields in config.html
2024-07-17 12:41:01 +05:30
Debanjum Singh Solanky
30d60aaae9 Add, fix Khoj Docker container labels 2024-07-17 10:41:17 +05:30
Debanjum Singh Solanky
583fa3c188 Migrate the pypi package to khoj project name. Update references
- Deprecate khoj-assistant pypi package. Use more accurate and
  succinct pypi project name, khoj
- Update references to sye khoj pypi package in docs and code instead
  of the legacy khoj-assistant pypi package
- Update pypi workflow to publish to both khoj, khoj-assistant for now
- Update stale python 3.9 support mentioned in our pyproject. Can't
  support python 3.9 as depend on latest django which support >=3.10
2024-07-17 10:41:16 +05:30
Debanjum Singh Solanky
7316e6b9d3 Pass isMobileWidth prop to SidePanel via chat share interface 2024-07-16 16:13:27 +05:30
Debanjum Singh Solanky
4759c4ac96 Turn prettier off instead of throwing error for now
Until web interface code is reformatted with prettier
2024-07-16 16:13:27 +05:30
Debanjum Singh Solanky
466ef3f8f1 Do no explicitly add line-clamp plugin as it's in Tailwind by default 2024-07-16 16:13:27 +05:30
Debanjum Singh Solanky
59000a47cb Move Desktop config page to /configure from /config url path
Update references to point to page at /configure instead of /config
2024-07-16 16:13:27 +05:30
Debanjum Singh Solanky
a5c16ad600 Move Web client config page to /configure from /config url path
Update docs, clients and error messages to point to /configure
instead of /config
2024-07-16 16:13:27 +05:30
Debanjum Singh Solanky
de15a7a3fc Rename API path /api/config to /api/configure
- Update clients calling /api/config to call /api/configure instead
2024-07-16 16:13:27 +05:30
Debanjum Singh Solanky
dd31936746 Make config api endpoint urls consistent
- Consistently use /content/ for data. Remove content-source from path
- Remove unnecessary /data/ prefix for API endpoints under /config
2024-07-16 16:13:27 +05:30
Debanjum Singh Solanky
e8176b41ef Reuse get config data logic across config pages on web client
- Put logic to get config data, detailed or basic into router helpers module
- Use the get config func across the config pages on web clients

- Put configure content and get_notion_auth_url funcs in router helper
  module to avoid circular import
2024-07-16 16:13:27 +05:30
sabaimran
1a5405e24c Fix interpretation of day of week in automation form 2024-07-16 10:12:30 +05:30
sabaimran
c837f3779e Update the agents page with new UX (#850)
- Use icons/colors for setting the styling of agents
- Update automations page to use the shadcn cards: https://github.com/shadcn-ui/ui
2024-07-16 10:10:55 +05:30
sabaimran
1c6ed9bc6d Migrate the existing automations page to use React (#849)
Migrates the Automations page to React, mostly keeping the overall design consistent with organization. Use component library, with some changes in color. Add easier management with straightforward form and editing experience.
Use system preference for determining dark mode if not explicitly set.
2024-07-15 21:42:33 +05:30
Debanjum
c7764c7470 Fix, Improve Behavior, Styling of Chat View on Web (#851)
### Behavior
- Close chat sessions side panel on click open a chat session
- Show agent profile card with description when hover on agent in chat view
- Show action bar on last chat message without hover
- Show chat message action buttons without hover on mobile interfaces
- Show chat message timestamp on hover in chat view
- Show text descriptions of chat message action buttons on hover
- Render inline png, webp images generated by Khoj in chat view

### Fixes
- Do not render references with broken links in chat view
- Fix closing side panel on mobile when click open a chat session
- Only open side panel as drawer in mobile view
- Constrain chat messages to stay within view port across screen sizes

### Styling: Spacing, Sizing, Mobile Friendly
- Make Khoj icon appropriately sized and side panel arrow bold
- Conversations list should resize to take max space on side panel
- Make loading message, styling configurable. Do not show agent when no data
- Improve Train of Thought icons spacing and loading circle
- Improve mobile friendly styling of chat session side panel
- Improve styling of chat input, references UI across screen sizes
- Center cursor in chat input. See upto 2 lines for multi-line context

### Miscellaneous
- Add code formatter for web interface with prettier
2024-07-15 08:39:14 -07:00
Debanjum Singh Solanky
6c630bc6c3 Constrain chat messages to stay in view port across screen sizes
- Constrain chat messages max width to view port across screen sizes
- Wrap references on smaller screens, use tailwind, not js to apply styling
2024-07-15 21:00:50 +05:30
sabaimran
9a5bf4c701 Fix rendering of teaser reference panel in mobile width 2024-07-15 19:40:55 +05:30
sabaimran
2e9275c0f3 Remove side panel padding in desktop view. Fix width in mobile view 2024-07-15 19:33:12 +05:30
Debanjum Singh Solanky
ba0ba6b59f Merge branch 'features/big-upgrade-chat-ux' of github.com:khoj-ai/khoj into document-styling-on-chat-ux 2024-07-15 10:42:56 +05:30
Debanjum
23f61d49e0 Support syncing, searching images from Obsidian plugin (#847)
- Sync images from Obsidian vault with Khoj server now that Khoj can OCR images
- Support rendering images returned by Khoj search modal
2024-07-14 20:41:39 -07:00
Debanjum Singh Solanky
6f8f846086 Standardize code format for web interface with prettier
Use husky, lint-staged to run prettier pre-commit
2024-07-15 00:34:54 +05:30
sabaimran
06dce4729b Make most major changes for an updated chat UI (#843)
- Updated references panel
- Use subtle coloring for chat cards
- Chat streaming with train of thought
- Side panel with limited sessions, expandable
- Manage conversation file filters easily from the side panel
- Updated nav menu, easily go to agents/automations/profile
- Upload data from the chat UI (on click attachment icon)
- Slash command pop-up menu, scrollable and selectable
- Dark mode-enabled
- Mostly mobile friendly
2024-07-14 23:18:06 +05:30
Debanjum Singh Solanky
6dd90931e8 Fix closing side panel on mobile when click open a chat session 2024-07-14 22:54:49 +05:30
Debanjum Singh Solanky
47b754c07b Only open side panel as drawer in mobile view 2024-07-14 14:08:41 +05:30
Debanjum Singh Solanky
b47f30ad77 Make Khoj icon appropriately sized and side panel arrow bold 2024-07-14 14:06:36 +05:30
Debanjum Singh Solanky
e6b21144e2 Conversations list should resize to take max space on side panel 2024-07-14 13:49:36 +05:30
Debanjum Singh Solanky
c2bf405489 Make loading message, styling configurable. Do not show agent when no data
- Pass Loading message, class name via props to both inline and normal
  loading spinners
- Pass loading conversation message to loading spinner when chat
  history is being fetched
2024-07-14 13:00:36 +05:30
Debanjum Singh Solanky
63719747cb Show agent profile card with description when hover on agent in chat view
- Create profile card componennt. Use it for agent profile card
- Pass agent persona from khoj server via API
- Put link to agent profile page in the hover card to make it 2 clicks
  away. Othewise inadvertent clicks on agent in chat view lead away to
  agent page
- Use tailwind line-clamp extension to clamp card to first two lines
2024-07-14 12:20:11 +05:30
Debanjum Singh Solanky
dbbd4b9777 Show action bar on last chat message without hover 2024-07-14 10:32:31 +05:30
Debanjum Singh Solanky
a0f38e079f Improve Train of Thought icons spacing and loading circle 2024-07-14 09:35:15 +05:30
Debanjum Singh Solanky
e9567741eb Improve mobile friendly styling of chat session side panel 2024-07-14 00:57:08 +05:30
Debanjum Singh Solanky
b26a6e25d1 Show chat message action buttons without hover on mobile interfaces
This is because hover maybe hard to do on mobile devices
2024-07-14 00:54:23 +05:30
Debanjum Singh Solanky
f69f9e3523 Close chat sessions side panel on click open a chat session 2024-07-14 00:53:16 +05:30
Debanjum Singh Solanky
d51011314f Improve styling of chat input, references UI across screen sizes
Use tailwind screen breakpoints shorthand instead of js to apply
different styling for different screen sizes
2024-07-13 20:45:34 +05:30
Debanjum Singh Solanky
2668e42e7f Center cursor in chat input. See upto 2 lines for multi-line context
- Reuse class name when get slash command icons
- Previous chat input styling didn't have the cursor centered in the
  chat input text area. But it did allow seeing multi line chat inputs
  for context
2024-07-13 02:51:29 +05:30
Debanjum Singh Solanky
aeaebfb515 Show chat message timestamp on hover in chat view 2024-07-13 02:51:19 +05:30
Debanjum Singh Solanky
e00c6b486e Add hover text descriptions of action buttons on chat message in web view 2024-07-12 15:40:51 +05:30
Debanjum Singh Solanky
5fccccfdff Do not render references with broken links in chat view 2024-07-12 15:14:11 +05:30
Debanjum Singh Solanky
b98a0cfe1b Render inline png, webp images generated by Khoj in chat view
Add spacing between chat message paragraphs
2024-07-12 15:13:19 +05:30
sabaimran
3e7e73ddd6 Switch from using dynamic routes to static routes and extracting slug from URL manually. See https://github.com/vercel/next.js/discussions/64660 for limitations with static export / dynamic routes 2024-07-11 23:06:27 +05:30
sabaimran
bea0aa5445 Improve the logged out share experience 2024-07-11 20:11:21 +05:30
Debanjum Singh Solanky
02658ad4fd Upgrade Django version 2024-07-11 16:35:10 +05:30
Debanjum Singh Solanky
cbae8b68fb Add DB migration from making bi_encode configs optional in #834 2024-07-11 16:33:31 +05:30
Debanjum Singh Solanky
3a75838196 Add Keyboard shortcuts to navigate in Khoj Desktop 2024-07-11 16:29:53 +05:30
Debanjum Singh Solanky
6c1861b319 Improve the prompt to generate images with DALLE3 and SD3
- Major
  - Ask for prompt in prose
  - Remove seed from SD3 image generation to improve diversity of output
    for a given prompt
    Otherwise for conversations with similar sounding
    prompts, the images would be almost exactly the same. This maybe
    another indicator of SD3's inability to capture detailed
    instructions
  - Consistently use "prompt" wording instead of "query" in improved
    image generation prompts.
    Previously a mix of those terms were being used, which could confuse
    the chat model

- Minor
  - Add day of week to prompt
  - Remove 2-5 sentence limit on instructions to SD3. It seems to be
    able to follow longer instructions just with less fidelity than
    DALLE. And the 2-5 sentence instruction limit wasn't being adhered to
  - Improve ability to edit, improve the image based on follow-up
    instructions by the user
  - Align prompts for DALLE and SD3. Only difference is to wrap text to
    be rendered in quotes for SD3. This improves it's ability to render
    requested text. DALLE cannot render text as well or consistently
2024-07-11 16:29:53 +05:30
Debanjum Singh Solanky
21fe1a917b Support syncing, searching images from Obsidian plugin 2024-07-11 16:22:31 +05:30
sabaimran
6f1d799759 Modularize code and implemenet share experience 2024-07-10 23:08:16 +05:30
sabaimran
1b4a51f4a2 Remove print statement for debugging timestamps 2024-07-10 14:54:22 +05:30
sabaimran
0369eb6e0e Fix timestamp bug for pending message and expand CSP for thumbnails 2024-07-10 14:53:31 +05:30
sabaimran
375685530f Add content security policy to the chat page 2024-07-10 11:18:41 +05:30
sabaimran
c5cfd0f2cf Remove unused slash command-related useeffect hook 2024-07-10 10:03:58 +05:30
sabaimran
e1a5c17775 Add DOMPurify for rendering md text. Add a easter egg in the console 2024-07-10 10:03:08 +05:30
sabaimran
e358723baa Fix image rendering and unique key for pending message? 2024-07-09 21:55:54 +05:30
sabaimran
c8c5d50b1a Improve command bar slash experience 2024-07-09 21:39:13 +05:30
sabaimran
c25bf97831 Update hover styling for see all button 2024-07-09 20:55:54 +05:30
sabaimran
23b71b0dff Remove shadow from the slash command bar 2024-07-09 20:52:38 +05:30
sabaimran
998e2aec30 Update dark mode, fix chat message time stamp, fix rendering for new message 2024-07-09 20:50:20 +05:30
sabaimran
0c6b6de09e Revert web client route chat page rendering logic 2024-07-09 19:47:04 +05:30
sabaimran
cc22e1b013 Add pop-up module for the slash commands 2024-07-09 19:46:17 +05:30
sabaimran
5b69252337 Add hover effects for chat messages 2024-07-09 14:56:57 +05:30
sabaimran
a0e9530fa4 Merge branch 'master' of github.com:khoj-ai/khoj into features/chat-ui-updates-big 2024-07-09 12:57:50 +05:30
sabaimran
260aa61818 Remove tests for python3.9 2024-07-09 12:28:11 +05:30
sabaimran
4471c1e37f Apply mitigations for piling up open connections
- Because we're using a FastAPI api framework with a Django ORM, we're running into some interesting conditions around connection pooling and clean-up. We're ending up with a large pile-up of open, stale connections to the DB recurringly when the server has been running for a while. To mitigate this problem, given starlette and django run in different python threads, add a middleware that will go and call the connection clean up method in each of the threads.
2024-07-09 12:22:58 +05:30
sabaimran
609e7ee19c Fix width of side panel 2024-07-09 12:02:01 +05:30
Debanjum
0b1b262512 Add system dependencies required by RapidOCR to fix Khoj Docker image (#842)
- Issue
The Khoj docker build would fail with `ImportError: libGL.so.1: cannot open shared object file: No such file or directory`. This was required by the Khoj RapidOCR python package dependency. 

- Fix
A minimal set of system packages have been added to resolve this issue.
2024-07-08 22:16:16 +05:30
kxnarak
43413cd21f add dependencies required by the RapidOCR python package 2024-07-08 18:26:19 +05:30
sabaimran
bf4c2f219e Merge branch 'master' of github.com:khoj-ai/khoj into features/chat-ui-updates-big 2024-07-08 17:00:42 +05:30
sabaimran
037e157648 Fix a variety of links 2024-07-08 16:49:13 +05:30
sabaimran
6b80bb3f37 Add a demo for the khoj mini application, minor updates to other pages, remove out of date demos page 2024-07-08 16:33:47 +05:30
Debanjum Singh Solanky
9e31ebff93 Release Khoj version 1.16.0 2024-07-07 18:26:10 +05:30
Debanjum Singh Solanky
54132efd67 Fix Khoj Obsidian plugin build 2024-07-07 18:26:10 +05:30
Debanjum Singh Solanky
510d9b3a29 Add short keys to open chat menu, new chat, search from Obsidian pane 2024-07-07 17:57:17 +05:30
Debanjum Singh Solanky
3e0c882e27 Transcribe only when keyboard shortcut or button pressed in Obsidian
- Transcribe on holding Ctrl+s keyboard shortcut
- Transcribe on holding the transcribe button pressed via mouse too
- Make the transcribe button robust to inadvertent touches by using timeout
- Do not transcribe, trigger auto-send on silences. Silence detection
  is super rudimentary, just blocks standard emanations by whisper
  when no speech
2024-07-07 17:57:17 +05:30
sabaimran
0eb000c3ea Add health checks for the django ORM 2024-07-07 16:11:28 +05:30
sabaimran
6f8a65c529 References, mobile friendly chat sessions and file filter 2024-07-07 15:42:29 +05:30
Debanjum Singh Solanky
a31cd0dec1 Fix async batch delete of indexed entries 2024-07-06 22:45:26 +05:30
Debanjum
08b379c2ab Fix, Improve Indexing, Deleting Files (#840)
### Fix
- Fix degrade in speed when indexing large files
- Resolve org-mode indexing bug by splitting current section only once by heading
- Improve summarization by fixing formatting of text in indexed files

### Improve
- Improve scaling user, admin flows to delete all entries for a user
2024-07-06 19:52:42 +05:30
Debanjum Singh Solanky
4a471979eb Upgrade sentence-transformer package to version 3.0.1
Add einops dependency for some sentence transformer models like the
nomic-embed
2024-07-06 19:35:59 +05:30
Debanjum Singh Solanky
d693baccbc Make it optional to set the encoder, cross-encoder configs via admin UI 2024-07-06 19:35:59 +05:30
Debanjum Singh Solanky
1baebb8d0e Identify markdown headings by any whitespace character after ^#+
Previously only markdown headings with space characters after # would
be considered a heading. So ^##\t wouldn't be considered a valid heading
2024-07-06 19:35:59 +05:30
Debanjum Singh Solanky
010486fb36 Split current section once by heading to resolve org-mode indexing bug
- Split once by heading (=first_non_empty) to extract current section body
  Otherwise child headings with same prefix as current heading will
  cause the section split to go into infinite loop
- Also add check to prevent getting into recursive loop while trying
  to split entry into sub sections
2024-07-06 19:35:59 +05:30
Debanjum Singh Solanky
6a135b1ed7 Fix degrade in speed of indexing large files. Improve summarization
Adding files to the DB for summarization was slow, buggy in two ways:
- We were updating same text of modified files in DB = no of chunks
  per file times

- The `" ".join(file_content)' code was breaking each character in the
  file content by a space. This formats the original file content
  incorrectly before storing in the DB

Because this code ran in the main file indexing path, it was slowing down
file indexing. Knowledge bases with larger files were impacted more strongly
2024-07-06 19:35:59 +05:30
Debanjum Singh Solanky
e6ffb6b52c Improve scaling user flow to delete all entries
- Delete entries by batch to improve efficiency of query at scale
- Share code to delete all user entries between it's async, sync methods
- Add indicator to show when files being deleted on web config page
2024-07-06 19:35:59 +05:30
Debanjum Singh Solanky
1ab59865b5 Improve scaling admin flow to delete all entries for user 2024-07-06 19:35:59 +05:30
Debanjum
05138cbd0a Use DOM Scripting, Add CSP to Web config pages. Disable CSP in Obsidian plugin (#834)
- Add CSP to web config pages. Load phone no. validation js, css from S3
- Construct config page elements on Web via DOM scripting
- Disable CSP in Khoj Obsidian as it interferes with Obsidian functionality

- Other miscellaneous voice message level improvements (rate limit, listening animation)
2024-07-06 19:30:09 +05:30
Debanjum Singh Solanky
9bdb48807b Ratelimit text to speech model. Validate share chat url domain
- Do not log auth error message on server when Resend setup as Magic
  links for sign-in are now supported
2024-07-06 12:53:19 +05:30
Debanjum Singh Solanky
b334db0fca Add CSP to web config pages. Load phone no validation js, css from S3 2024-07-06 12:48:28 +05:30
Debanjum Singh Solanky
2f034f807a Construct config page elements on Web via DOM scripting.
Minimize isage of innerHTML to prevent DOM clobbering and unintended
escape by user Input
2024-07-06 12:48:28 +05:30
Debanjum Singh Solanky
69c9e8cc08 Disable CSP in Khoj Obsidian as it interferes with Obsidian functionality
The Khoj CSP interferes with other Obsidian features and plugins as
CSP is applied page wide.

For now chat message sanitization via Dompurify should suffice.

Enable CSP when can scope it to only the Khoj Obsidian plugin.
2024-07-05 16:10:08 +05:30
Debanjum Singh Solanky
a353d883a0 Make it optional to set the encoder, cross-encoder configs via admin UI
Upgrade sentence-transformer, add einops dependency for some sentence
transformer models like nomic
2024-07-05 16:09:30 +05:30
Debanjum Singh Solanky
6d59ad7fc9 Add listening circle animation to speak button in Obsidian plugin
Use icon active focus as color of animation button
2024-07-05 14:00:53 +05:30
sabaimran
aec44a0b89 Add dark mode toggle! And improve experience for train of thought 2024-07-04 18:29:21 +05:30
Debanjum Singh Solanky
516af86575 Fix add, remove of the text to speech loader element in Obsidian 2024-07-04 17:38:45 +05:30
sabaimran
465ef0b772 Add a loading experience when waiting for khoj response 2024-07-04 13:49:51 +05:30
Debanjum Singh Solanky
814aca6d69 Skip summarize when not triggered via slash cmd and can't summarize
Maybe better to fallback to non-summarize behavior if summarize intent
is just inferred but we can't actually summarize because the single
file added to conversation isn't satisfied
2024-07-04 13:31:00 +05:30
Debanjum
4446de00d3 Enable Voice, Keyboard Shortcuts in Khoj Obsidian Plugin (#837)
- Simplify quick jump between Khoj side pane and main editor view using keyboard shortcuts
- Enable voice chat in Obsidian to make interactions with Khoj more seamless
2024-07-04 13:28:29 +05:30
sabaimran
5ea8b16f84 Fix missing method error 2024-07-04 12:08:22 +05:30
sabaimran
d61bddf56c Fix retrieving image model by prefetching the openai config in the async method 2024-07-04 11:58:33 +05:30
sabaimran
a129b017b9 Fix image generation on server -- use default config when not set by user 2024-07-04 09:13:23 +05:30
sabaimran
34118078bf kill the emojis 2024-07-04 00:30:21 +05:30
sabaimran
d5ba916978 Working example of streaming, intersection observer, other UI updates 2024-07-04 00:30:01 +05:30
sabaimran
78d1a29bc1 Finish up filte filter side panel menu 2024-07-02 23:32:36 +05:30
sabaimran
6fa2dbc042 Do not use the custom configured max prompt size to send message to anthropic 2024-07-02 21:59:06 +05:30
sabaimran
8a6722ba97 Add basic implementation for chat side panel components 2024-07-02 21:56:43 +05:30
Debanjum Singh Solanky
afcfc60637 Merge DB migrations post merge of SD3 via API support PR 2024-07-02 17:54:58 +05:30
Debanjum
c015eeb5dd Improve Online Search: Parallelize Search, Use Jina Reader API by default (#832)
- Overview
  Khoj wil be able to do online search out of the box, even for self-hosted users
  - Default to Jina search, reader API when no Serper.dev, Olostep API keys
  - Run online searches in parallel to process multiple queries faster

- Details
  - Jina provides a [reader API](https://github.com/jina-ai/reader) for online search and web page reading
     It requires no API key. This provides a good default to enable 
     online search for self-hosted readers requiring no additional setup. 

  - Jina search API also returns webpage contents with the results, so
     just use those directly when Jina Search API used instead of
     trying to read webpages separately. The extract relevant content from
     webpage step using a chat model is still used from the
    `read_webpage_and_extract_content' func in this case.

  - Parse search results from Jina search API into same format as
     Serper.dev for accurate rendering of online references by clients

  - Run online searches in parallel with AsyncIO to process multiple queries faster
2024-07-02 17:44:51 +05:30
Debanjum
826c3dc9cc Enable using Stable Diffusion 3 for Image Generation via API (#830)
- Support Stable Diffusion 3 via API
  Server Admin needs to setup model similar to DALLE-3 via Django Admin Panel
- Use shorter prompt generator to prompt SD3 to create better images
- Allow users to set paint model to use from web client config page
2024-07-02 17:28:50 +05:30
Debanjum Singh Solanky
d5ceff2691 Update tests and documentation with Jina reader API usage and info
Update offline, openai chat actor, director tests to not require
Serper to run the online command tests

Update documentation for self-hosted online search to mention no setup
is required by default. But improvements can be made by using
Serper.dev or Olostep
2024-07-02 17:19:09 +05:30
Debanjum Singh Solanky
553beae848 No need to set OpenAI API key from environment variable explicitly
It is unnecessary as the OpenAI client automatically tries to use API
key from OPENAI_API_KEY env var when the api_key field is unset
2024-07-02 17:19:09 +05:30
Debanjum Singh Solanky
a038e4911b Default to Jina search, reader API when no Serper.dev, Olostep API keys
Jina AI provides a search and webpage reader API that doesn't require
an API key. This provides a good default to enable online search for
self-hosted readers requiring no additional setup.

Jina search API also returns webpage contents with the results, so
just use those directly when Jina Search API used instead of
trying to read webpages separately. The extract relvant content from
webpage step using a chat model is still used from the
`read_webpage_and_extract_content' func in this case.

Parse search results from Jina search API into same format as
Serper.dev for accurate rendering of online references by clients
2024-07-02 17:19:08 +05:30
Debanjum Singh Solanky
ff44734774 Run online searches in parallel to process multiple queries faster 2024-07-02 17:19:08 +05:30
sabaimran
0ee7cc8c47 Change overall architecure of how information is flowing for better statefulness 2024-07-02 12:39:54 +05:30
sabaimran
541ce04ebc Checkpoint: Updated sidebar panel with new components
- Add non-functional UI elements for chat, references, feedback buttons, rename/share session, mic, attachment, websocket connection
2024-07-02 11:18:50 +05:30
Raghav Tirumale
8eccd8a5e4 Support Indexing Images via OCR (#823)
- Added support for uploading .jpeg, .jpg, and .png files to Khoj from Web, Desktop app
- Updating indexer to generate raw text and entries using RapidOCR
- Details
  * added support for indexing images via ocr
  * fixed pyproject.toml
  * Update src/khoj/processor/content/images/image_to_entries.py
     Co-authored-by: Debanjum <debanjum@gmail.com>
  * Update src/khoj/processor/content/images/image_to_entries.py
     Co-authored-by: Debanjum <debanjum@gmail.com>
  * removed redudant try except blocks
  * updated desktop js file to support image formats
  * added tests for jpg and png
  * Fix processing for image to entries files
  * Update unit tests with working image indexer
  * Change png test from version verificaition to open-cv verification

---------

Co-authored-by: Debanjum <debanjum@gmail.com>
Co-authored-by: sabaimran <narmiabas@gmail.com>
2024-07-01 06:00:00 -07:00
Debanjum Singh Solanky
cffc14a46a Trigger voice chat via keyboard shortcut in Khoj side pane
Quickly trigger voice chat from Khoj side pane using Keyboard shortcuts
2024-07-01 18:06:09 +05:30
Debanjum Singh Solanky
3723904512 Toggle jump between Khoj side pane & previous editor via cmd, kbd shortcut
Improve quick navigation to, from Khoj side pane using Keyboard
shortcut or Obsidian command
2024-07-01 18:05:59 +05:30
Debanjum Singh Solanky
fbb95ca342 Put cursor on chat input when focus on chat view in Obsidian
This should improve fluidity of keyboard interactions with Khoj on
Obsidian.

Open Khoj chat view via keybinding or command pallete and ask
question using only the keyboard, with no mouse clicks required
2024-07-01 18:05:55 +05:30
Debanjum Singh Solanky
093e276908 Enable Voice chat in Khoj Obsidian plugin
- Automatically carry out voice chats with Khoj from within Obsidian
  When send voice message, Khoj will auto respond with voice as well
- Listen to past Khoj messages as speech
- Add circular loading spinner to use while message is being converted
  to speech
2024-07-01 18:02:28 +05:30
sabaimran
c83b8f2768 Allow just one worker to be the background schedule leader (#836)
* Add a leader election mechanism to circumvent runtime issues for multiple schedulers

- Reduce the load on the DB and risk of issues on the service side by limiting the execution environment to one elected leader at a given time. This one is responsible for managing all of the execution of the jobs, though all workers are capable of adding and removing jobs

* Set a max duration for the schedule leader position (12 hrs), add some error if automation not added successfully
2024-06-28 13:13:25 +05:30
sabaimran
80fe5ce182 Fix user not authenticated interpretation error 2024-06-27 21:13:54 +05:30
Raghav Tirumale
24a0d8b073 Add OS Level Shortcut Window for Quick Access to Khoj Desktop (#815)
* rough sketch of desktop shortcuts. many bugs to fix still

* working MVP of desktop shortcut khoj

* UI fixes

* UI improvements for editable shortcut message

* major rendering fix to prevent clipboard text from getting lost

* UI improvements and bug fixes

* UI upgrades: custom top bar, edit sent message and color matching

* removed debug javascript file

* font reverted to Noto Sans

* cleaning up the code and removing diffs

* UX fixes

* cleaning up unused methods from html

* front end for button to send user back to main window to continue conversation

* UX fix for window and continue conversation support added

* migrated common js functions into chatutils.js

* Fix window closing issue in macos by

1. Use a helper function to determine if the window is open by seeing if there's a browser window with shortcut.html loaded
2. Use the  event listener on the window to handle teardown

* removed extra comment and renamed continue convo button

---------

Co-authored-by: sabaimran <narmiabas@gmail.com>
2024-06-27 07:20:13 -07:00
sabaimran
870d9ecdbf Add a fact checker feature with updated styling (#835)
- Add an experimental feature used for fact-checking falsifiable statements with customizable models. See attached screenshot for example. Once you input a statement that needs to be fact-checked, Khoj goes on a research spree to verify or refute it.
- Integrate frontend libraries for [Tailwind](https://tailwindcss.com/) and [ShadCN](https://ui.shadcn.com/) for easier UI development. Update corresponding styling for some existing UI components. 
- Add component for model selection 
- Add backend support for sharing arbitrary packets of data that will be consumed by specific front-end views in shareable scenarios
2024-06-27 18:45:38 +05:30
sabaimran
3b7a9358c3 Add our first view via Next.js for Agents (#817)
Initialize our migration to use Next.js for front-end views via Agents. This includes setup for getting authenticated users, reading in available agents, setting up a pop-up modal when you're clicking on an agent, and allowing users to start new conversations with agents.

Best attempt at an in-place migration, though there are some noticeable differences.

Also adds view for chat that are not being used, but in experimental phase.
2024-06-27 13:56:16 +05:30
Debanjum Singh Solanky
afbeee9e82 Rename copy-button to more general chat-action-button in Obsidian client
- Use 4 space indent of activateView function in pane_view component
2024-06-26 18:09:23 +05:30
sabaimran
8c12a69570 Fix issue in anthropic chat when khoj message becomes top message
This is because Anthropic requires the first message in the chat history to be from the user.
2024-06-26 12:59:34 +05:30
Debanjum Singh Solanky
4f89319b40 Release Khoj version 1.15.0 2024-06-26 10:38:16 +05:30
Debanjum Singh Solanky
bbfd320ed4 Use Yarn instead of NPM to bump Desktop, Obsidian client versions 2024-06-26 10:37:58 +05:30
Debanjum Singh Solanky
c793d8a69e Add Validation logic to save PaintModel. Use API key from Paint Model
Rename Paint Model, Adapters to TextToImage for consistency
2024-06-26 10:16:26 +05:30
Debanjum Singh Solanky
1acf969c6e Do not require OpenAI to generate image as local chat + sd3 works now
Previously the text_to_image helper would only trigger the image
generation flow if OpenAI client was setup. This is not required
anymore as offline chat model + sd3 API works. So remove that check
2024-06-26 10:16:26 +05:30
Debanjum Singh Solanky
2c4bf91a61 Allow user to set paint model to use from web client config page 2024-06-26 10:16:26 +05:30
Debanjum Singh Solanky
eb09aba747 Remove quotes wrapping the prompt from being passed to image gen model 2024-06-26 10:16:26 +05:30
Debanjum Singh Solanky
fdd4c02461 Use shorter prompt generator to prompt SD3 to create better images 2024-06-26 10:16:26 +05:30
Debanjum Singh Solanky
eda33e092f Enable using Stable Diffusion 3 for Image Generation via API 2024-06-26 10:16:26 +05:30
Debanjum
a25689fabf Use user theme in Obsidian for Khoj plugin styling (#825)
Makes the Khoj chat in the Obsidian plugin adapt better to the user theme, making it feel more seamless, and helps with dark mode compatibility
2024-06-26 10:14:17 +05:30
Debanjum Singh Solanky
cfe46fd9f5 Add Border Color instead of BG Color for Chat Message in Obsidian 2024-06-26 08:11:04 +05:30
sabaimran
fb818ead60 Use active bg instead of code background for khoj response 2024-06-26 08:05:13 +05:30
sabaimran
a4b2552540 Update conversation session selection menu to use Obsidian theme colors as well 2024-06-26 08:05:13 +05:30
sabaimran
da5b07e913 Remove custom styling on the reference buttons 2024-06-26 08:05:13 +05:30
sabaimran
c4a1ae9375 Make the Khoj Obsidian plugin more user theme friendly
Use the CSS variables from the theme for the Khoj UI components
2024-06-26 08:04:17 +05:30
Debanjum Singh Solanky
d6fe5d9a63 Pass current component as arg to markdown renderer in chat view
This doesn't work on search modal, but hopefully will get resolved
once we migrate search into a view from a modal
2024-06-24 16:12:20 +05:30
Debanjum Singh Solanky
0d04018622 Install pydantic with optional email validator package
Otherwise Khoj fails on startup. Not sure why, must be new changes to
pydantic?
2024-06-24 16:12:20 +05:30
Debanjum Singh Solanky
6f280b1ccc Split setup of specific OpenAI API proxies into separate doc pages 2024-06-24 16:12:20 +05:30
Debanjum Singh Solanky
68e7c297e0 Add Advanced Self Hosting Section, Improve Self Hosting, OpenAI Proxy Docs
- Add instructions for self-hosted users with info, warning boxes to
  avoid, fix common issues when setting up Khoj server
- Create new Advanced Self Hosting section
  - Extract Advanced Self-Hosting Sections from the Advanced Page and
    move them to separate Pages under Advanced Self Hosting section
- Improve OpenAI Proxy Docs
  - Put Ollama setup as a section under OpenAI API Proxy page instead
    of a separate page
  - Add Section to use Khoj with chat model from LM Studio
  - Update LiteLLM docs to use chat model from LM Studio
2024-06-24 16:12:20 +05:30
Debanjum Singh Solanky
732332a3c5 Spell fix s/e.g/e.g./ across code, tests and docs 2024-06-24 15:24:45 +05:30
Debanjum Singh Solanky
8fc7f980aa Revert KHOJ_DOMAIN to only support single domain.
Multiple domain support didn't generalize to other portions where it
is used
2024-06-24 15:24:45 +05:30
sabaimran
4110e71e84 Add info in the documentation about text to speech 2024-06-24 12:46:33 +05:30
sabaimran
939811e9b5 Fix conversation look up logic 2024-06-24 09:10:03 +05:30
Debanjum Singh Solanky
a4d88612c1 Just use yarn for package version locking. Remove npm package lock 2024-06-23 16:06:20 +05:30
Debanjum Singh Solanky
55be90cdd2 Sanitize user input fields on Automations page of web client
Use Dompurify to sanitize user input
2024-06-23 14:14:47 +05:30
Debanjum Singh Solanky
1c7a562880 Generate automation cards via DOM scripting 2024-06-23 13:22:38 +05:30
Debanjum Singh Solanky
57a36967bf Run Obsidian version script in bump_version.sh to write to versions.json
This handles updates from manifest.json minAppVersion field to the
versions.json file.

The minAppVersion field is for the minimum Obsidian app version
supported by a Khoj plugin version
2024-06-23 08:18:55 +05:30
Debanjum Singh Solanky
c7c32a7467 Improve online chat reference extraction in Khoj.el Emacs package
- Handle online references with no title
- Improve handling references which are arrays instead of lists
2024-06-23 08:13:36 +05:30
Debanjum Singh Solanky
9d33d8c0fa Upgrade typescript eslint dev dependency of Khoj Obsidian plugin 2024-06-23 07:36:49 +05:30
Debanjum
a94062469a Automatically Find Similar Notes on Emacs in Background (#827)
Khoj will find and display notes similar to the current entry in the side pane when
1. find similar is open in side pane and
2. cursor has moved to a new entry

### Major
- Find similar notes to current note at cursor automatically in background
- Only show headings of search result and increase default results count

### Minor
- Pass absolute path of file to index from khoj.el emacs client
- Update help message to only show the smaller set of new keybindings
- Fix edge cases in loading some chat sessions
2024-06-23 07:36:11 +05:30
sabaimran
38090b2553 In dockerize.yml file, revert the added configuration 2024-06-22 21:11:25 +05:30
sabaimran
a53178cab9 Add developer support for using next.js to serve generated static files (#814)
To improve the developer experience for front-end development, we're migrating to Next.js. In order to do this migration page-by-page, we're using static site generation via Next.js. This also helps us avoid making cross site requests from front-end to back-end for the time being, while giving a ramp to separating out server and client if needed for scale down the road.

Dev instructions for using the next.js setup are in the added README.

This adds scaffolding for including the built files in the python package as well as the docker images. Docker setup has been tested locally. In order to verify the build is working as expected, we can navigate to the {khoj_host}:42110/experimental and verify that the experiment page comes up.

This setup works with serving static files included in the src/interface/web folder from the Django app. The key bit for understanding the setup is in the yarn export command in package.json.
2024-06-22 20:12:41 +05:30
Debanjum Singh Solanky
59edb99f04 Simplify, improve bump version development script
- Just use in-built `npm version' command to update desktop, obsidian version
- Upgrade by major, minor or patch version using new -t flag in script
  E.g bump_version -t minor
2024-06-22 18:19:38 +05:30
Debanjum Singh Solanky
abd6f58aee Upgrade Desktop app package dependencies 2024-06-22 17:38:52 +05:30
Debanjum Singh Solanky
f413dc62cd Upgrade Obsidian plugin dependencies. Add package lock file for it
Add it to bump_version script as well.
2024-06-22 17:38:52 +05:30
Debanjum Singh Solanky
1d7d51a7ab Upgrade Documentation packages 2024-06-22 17:38:48 +05:30
Debanjum Singh Solanky
22f6db0a6b Upgrade RapidOCR and enable for Python 3.12. Fix PDF OCR test 2024-06-22 16:01:55 +05:30
Debanjum Singh Solanky
55a23eae25 Upgrade pillow to fix pytest workflow failure 2024-06-22 15:17:43 +05:30
Debanjum Singh Solanky
7e277e9381 Fix getting file-toggle-button element in chat of web app 2024-06-21 15:54:38 +05:30
Debanjum Singh Solanky
fa7b40ab86 Automatically respond with Voice if subscribed user sent Voice message 2024-06-21 15:53:01 +05:30
Debanjum Singh Solanky
5e5fe4b7af Improve font size, spacing of conversation session on desktop app 2024-06-21 12:25:35 +05:30
sabaimran
d3c0111121 Include base URL when using openai api config in extract questions. Close #831 2024-06-21 12:18:50 +05:30
sabaimran
b9966eb3d4 Add support for text to speech in chat responses (#821)
* Enable speech to text responses in khoj chat

- Current issue: reads out all the markdown formatting, plus waits for the whole result to be streamed before playing it

* Extract content from markdown-formatted text

* Add a loader for while you're waiting for Khoj's response

* Add user configuration option for chat model options, allow server side configuration for option list

* Join up APIs, views, admin pages to allow configuring custom voice models
2024-06-21 11:30:28 +05:30
Debanjum Singh Solanky
427575e958 Improve khoj chat new, delete session flows
When create new conversation session, automatically request query. As
that is expected next action after creating new session

Pass session-id to khoj-chat to allow reuse from
create-new-conversation func

When delete conversation session, do not call load chat session.
Unnecessary action.

Use thread-last to improve code flow in new, delete conversation funcs
2024-06-21 10:54:59 +05:30
Debanjum Singh Solanky
59032a06d5 Improve defaults when extracting fields from online reference in khoj.el 2024-06-21 10:54:59 +05:30
Debanjum Singh Solanky
9262aea7a5 Fix comments, func calls based on melpazoid, checkdoc, package-lint 2024-06-21 10:54:59 +05:30
sabaimran
ff26b19d2b Add a migration for allowing the docx field in the entries file type 2024-06-21 09:47:49 +05:30
sabaimran
3cfe5aabe5 Add support for magic link email sign-in (#820)
* Add magic link email sign-in option

* Adding backend routes and model changes to keep state of email verification code and status

* Test and fix end to end email verification flow

* Add documentation for how to use the magic link sign-in when self-hosting Khoj

* Add magic link sign in to public conversation page
2024-06-20 13:32:58 +05:30
Debanjum Singh Solanky
0afe66ac39 Restore cursor to original window after opening Khoj side pane
Previously the cursor would move to the Khoj side pane on opening it.
This would break user's flow, especially when find similar triggers
automatically

New behavior maintains smoother update of auto find similar without
disrupting user browsing
2024-06-20 12:50:13 +05:30
Debanjum Singh Solanky
afe91a2633 Only show headings of search result and increase total count returned
Previously it would show complete result body this would make the
result width variable and hard to track all the returned results

Showing just heading makes it easier to track
2024-06-20 12:50:13 +05:30
Debanjum Singh Solanky
2b12a5514e Find similar notes to current note at cursor automatically in background
- Call find similar on current element if point has moved to new
  element
- Delete the first result from find-similar search results as that'll
  be the current note (which is trivially most similar to itself)
- Determine find-similar based text formating at the rendering layer
  rather than at the top level find-similar func
2024-06-20 12:50:13 +05:30
Raghav Tirumale
093eb473cb Add Documentation for the /summarize Command (#822)
* added documentation for the /summarize command

* Add a hint for natural language usage

---------

Co-authored-by: sabaimran <narmiabas@gmail.com>
2024-06-20 12:08:01 +05:30
Raghav Tirumale
bd3b590153 Support Indexing Docx Files (#801)
* Add support for indexing docx files and associated unit tests

---------

Co-authored-by: sabaimran <narmiabas@gmail.com>
2024-06-20 11:18:01 +05:30
Debanjum Singh Solanky
d042e073cc Pass absolute path of file to index from khoj.el emacs client 2024-06-20 00:26:18 +05:30
Debanjum Singh Solanky
d23f2849d4 Update help message to only show the smaller set of new keybindings 2024-06-20 00:26:18 +05:30
Raghav Tirumale
d4e5c95711 Add Ability to Summarize Documents (#800)
* Uses entire file text and summarizer model to generate document summary.
* Uses the contents of the user's query to create a tailored summary.
* Integrates with File Filters #788 for a better UX.
2024-06-18 19:31:07 +05:30
Debanjum Singh Solanky
677d49d438 Release Khoj version 1.14.0 2024-06-18 17:13:46 +05:30
Debanjum Singh Solanky
2930b57c78 Use hashed value to improve deduplication of search results on server 2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky
6814dadd21 Fix opening Web, Desktop setup links on first run from Desktop app
Previous version failed to open the setup links
2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky
632f55a9e8 Do not default to rerank if device has GPU 2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky
f1120f24a1 Use solarized light css styling to highlight code in chat messages 2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky
d8a5a01cea Pass multiple allowed Khoj domains via KHOJ_DOMAIN env var
To add multiple allowed Khoj domains pass them as a comma separated
list of domains via the KHOJ_DOMAIN environment variable

Resolve comment in issue #662
2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky
4daf16e5f9 Only redirect to next url relative to current domain 2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky
86a3505d89 Remove image HTML elements from non whitelisted sources in Obsidian chat
Given img src enforcement via CSP required loosening. Soft enforce it
via a regex replace of img HTML elements if the src isn't from the
whitelisted set of source prefixes.

Currently allowed source prefixes are
- app: for local images
- data: for inline generated images
- https://generated.khoj.dev: for cloud generated images
2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky
c7d825bddb Sanitize markdown in Obsidian after conversion to HTML too
- Create and use a function to convert markdown to sanitized html
- Remove unused Latex delimiter handling as Katex isn't used in
  Khoj chat on Obsidian
2024-06-18 17:04:25 +05:30
Debanjum Singh Solanky
08c3aa496d Loosen CSP in Obsidian to load images, sync and allow Obsidian domain 2024-06-18 17:04:25 +05:30
sabaimran
327045be43 Make some basic updates to the chat documentation. Inc. conversation file filters, new screenshot 2024-06-18 12:14:59 +05:30
sabaimran
76e1bed8f9 Update Obsidian documentation 2024-06-18 08:22:10 +05:30
sabaimran
a57e1e7a14 Fix langchain, tenacity versions 2024-06-17 14:52:11 +05:30
sabaimran
ce9c14f894 Fix more packages related to langchain in the pyproject.toml 2024-06-17 14:38:05 +05:30
sabaimran
ba0187798a Get converastion id before retrieving relevant notes in non-socket code 2024-06-17 14:26:06 +05:30
Debanjum
d2d9f4888e Upgrade Khoj Emacs UX (#812)
- Open Khoj in Emacs Side pane
   Open Khoj chat, search in right pane to allow for ambient engagement
- Improve Khoj Chat
  - Show online references used for chat
  - Make chat API call async to not block user interactions
  - Fix loading chat history, references in khoj.el chat buffer
- Improve Khoj Search, Find Similar functions
   - Make calls to Khoj search API async to not block user interactions
- Support Conversation Sessions
  - Create transient menu to open, create, delete conversation sessions from the Khoj Emacs client
2024-06-16 10:39:48 +05:30
Debanjum Singh Solanky
fe36adb7b9 Remove short keys to switch content type during search to avoid conflict
- C-x o to switch to search org content conflicts with switch buffer shortkey
  This is more apparent in the async search scenario as it prevents
  perform other actions while async search is in progress

- Also switching content type wouldn't scale to all the content types
  Khoj will support without causing more conflicting keybinding
2024-06-15 17:31:19 +05:30
Debanjum Singh Solanky
2a84524d19 Make khoj.el search, similar API calls async to not block user interactions 2024-06-15 17:30:58 +05:30
Debanjum Singh Solanky
c6b95f8776 Handle rendering messages using the old reference schema in khoj.el
Previously references were a list instead of a map
2024-06-15 17:30:58 +05:30
Debanjum Singh Solanky
db056c896d Delete old conversation sessions from the chat menu in Khoj Emacs 2024-06-15 17:30:58 +05:30
Debanjum Singh Solanky
e3d995a74f Extract select conversation session logic into func for reusability 2024-06-15 17:30:38 +05:30
Debanjum Singh Solanky
e15dc23bbe Improve logic to create vs reuse window for khoj side pane logic
Khoj side pane occupies a vertically split bottom right side pane.
If the bottom right window is not a vertical split, create a new
vertical split pane for khoj, otherwise reuse the existing window
2024-06-15 16:37:41 +05:30
Debanjum Singh Solanky
055e5e8d26 Create new conversation from the chat menu in Khoj Emacs 2024-06-15 16:37:41 +05:30
Debanjum Singh Solanky
c33954cd93 Fix loading an empty chat session in Emacs 2024-06-15 16:37:41 +05:30
Debanjum Singh Solanky
e21c0648ae Create, use reusable function to call Khoj API from elisp 2024-06-15 16:37:41 +05:30
Debanjum Singh Solanky
7bcb49b6e7 Support conversation sessions in the Khoj Emacs client
Add option in khoj main transient menu option to open menu to
- switch between existing conversations
2024-06-15 13:13:20 +05:30
Debanjum Singh Solanky
df9c5ff263 Show online references used for chat response as footnotes in Emacs
Previously online references used weren't being shown
2024-06-15 13:13:19 +05:30
sabaimran
82f37971c5 Fix broken link in automations.md 2024-06-14 16:22:27 +05:30
sabaimran
25d8cdd9cd Misc fixes:
- Fix getting file filters for not found conversations
- Allow iamge rendering in automation emails
- Fix nearest 15th minute calculation in automations creation
2024-06-14 16:20:22 +05:30
sabaimran
971f1cd897 Add basic page about automations 2024-06-14 15:52:30 +05:30
sabaimran
17bce930ba Add a documentation page for keyboard shortcuts 2024-06-14 14:30:31 +05:30
Raghav Tirumale
35715096f4 UX Improvement: Keyboard Shortcuts for Recent Messages (#804)
* added keyboard shortcuts to access old queries
2024-06-14 12:45:09 +05:30
sabaimran
2dcfb3c2f0 Fix bug for drag and drop single file 2024-06-14 12:01:10 +05:30
sabaimran
7e4a61f2ac Disable rate limiting if billing is not enabled 2024-06-12 21:39:02 +05:30
Debanjum Singh Solanky
385057f09e Make khoj.el chat API call async to not block user interactions 2024-06-12 21:04:48 +05:30
sabaimran
45e725ac9c Use the summarizer model for generating improved image prompts 2024-06-12 17:41:12 +05:30
Raghav Tirumale
673d0d367c Fix: Adding Support for Uploading Multiple Files (#803)
* added support for uploading multiple files at a time.

* optimized multiple file upload to use a batch upload

* allowing files to upload even if there is one unsupported file
2024-06-12 15:51:35 +05:30
Debanjum Singh Solanky
906ebee075 Open Khoj chat, search in right pane to allow for ambient engagement
See the currently active window in context while doing chat, search
or find similar operations in a side pane.

This is similar to how we've moved Khoj on Obsidian into the side pane
as well
2024-06-09 23:32:34 +05:30
Debanjum Singh Solanky
cd4baa3fa5 Fix loading chat history, references in khoj.el chat buffer 2024-06-09 18:34:00 +05:30
Debanjum
6afbd8032e Improve Intermediate Steps in Formulating Chat Response (#799)
# Major
- Disambiguate Text output mode to disambiguate from Default data source lookup
- Fix showing headings in intermediate step in generating chat response
- Remove "Path" prefix from org ancestor heading in compiled entry

# Minor
- Fix OpenAI chat actor, director unit tests
2024-06-09 07:55:01 +05:30
Debanjum Singh Solanky
f440ddbe1d Fix openai chat actor, director tests
- Update test ChatModelOptions setup since update to it's schema
- Fix stale function calls using their updated signatures
2024-06-09 07:24:47 +05:30
sabaimran
2e209ab28b Handle case where conversation does not (yet) exist 2024-06-08 16:22:12 +05:30
sabaimran
849c38c0a4 Add support for managing audiences for new users 2024-06-08 15:51:17 +05:30
sabaimran
06a47ee457 Add language-specific syntax highlighting via highlight.js (#802)
* Add language-specific syntax highlighting via highlight.js

- Add highlight.js to our assets CDN for fast load and compliance with the CSP
- See other stylesheets options here: https://cdnjs.com/libraries/highlight.js

* Bonus: set min-height to prevent increasing length of the sessions pane

* Fix references rendering and add highlight.js in public conversation
2024-06-08 15:17:09 +05:30
Debanjum Singh Solanky
5f2442450c Update truncation test to reduce flakyness in cloud tests
Removed dependency on faker, factory for the truncation tests as that
seems to be the point of flakiness
2024-06-07 19:42:48 +05:30
sabaimran
dbb06466bf Minor fit/finish updates to the file filter experience 2024-06-07 15:05:00 +05:30
sabaimran
58a02f06ea Fix multilingual font rendering (#797)
* Fix multilingual font rendering; fallback to an Arabic language font which contains more Asian characters. Close #756

* Tune font-sizes and styling to accomodate new fonts with old sizing

- Move connection-status styling out from inline html into css block
- Remove start typing chat-input height jitter
- align new-conversation button, text
- use relative font sizes instead of absolute font sizes in most places

---------

Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>
2024-06-07 11:53:47 +05:30
Raghav Tirumale
ba16afd3c2 New Feature: Adding File Filtering to Conversations (#788)
* UI update for file filtered conversations
* Interactive file menu #UI to add/remove files on each conversation as references.
* Backend changes implemented to load selected file filters from a conversation into the querying process.
---------

Co-authored-by: sabaimran <narmiabas@gmail.com>
2024-06-07 10:53:37 +05:30
Debanjum Singh Solanky
f91cdf8e18 Fix showing headings in intermediate step in generating chat response 2024-06-06 16:52:23 +05:30
Debanjum Singh Solanky
18f7e6e7ed Remove "Path" prefix from org ancestor heading in compiled entry 2024-06-06 16:51:26 +05:30
sabaimran
8d701ebe22 Add fedCM to accommodate google migration (#798)
- See migration guidelines here: https://developers.google.com/identity/gsi/web/guides/fedcm-migration#fedcm_flag
2024-06-06 14:23:16 +05:30
Debanjum Singh Solanky
dd2225b1aa Use Text output mode to disambiguate from Default data source lookup
Previously if default output was selected by Khoj, we'd end up doing
an documents search as well, even when Khoj selected internet or
general data source to lookup.

This update disambiguates the default information mode from the text
output mode. To avoid doing documents search when not deemed necessary
by Khoj
2024-06-06 11:56:48 +05:30
Debanjum Singh Solanky
a1e4f4bde7 Gracefully skip indexing when empty list of docs provided
Improve error message when fail to index content
2024-06-05 19:39:15 +05:30
Debanjum Singh Solanky
21987f60c7 Use `-difference' to get files to delete. Make batch size defcustom
Improve docstrings to align with `checkdoc' requirement for all args
being mentioned
2024-06-05 19:39:15 +05:30
Debanjum
bfacd65971 Batch upload files for indexing from the Emacs client (#735) from yuzhou721/master
Encode filenames and batch file uploads to improve sending content to index from the Emacs client
2024-06-05 19:31:06 +05:30
sabaimran
a9c383e62c Use an ASGI application, rather than WSGI
- ASGI should be the preferred application, as our codebase runs a lot of async code
2024-06-05 09:25:08 +05:30
sabaimran
0816cec4bc Manually close old db connections periodically 2024-06-04 22:19:47 +05:30
sabaimran
acfdc8da77 Explicitly set the connection age to 0 in the django settings. Seems to be some strange behavior with async gunicorn + django db 2024-06-04 20:31:51 +05:30
Debanjum Singh Solanky
85a343363b Release Khoj version 1.13.0 2024-06-04 11:57:44 +05:30
Debanjum
1dfd6d7391 Merge pull request from GHSA-h2q2-vch3-72qm
Add CSP and sanitize chat messages in Obsidian, Desktop, Web apps
2024-06-04 11:29:21 +05:30
Debanjum Singh Solanky
b757ba664f Sanitize chat messages to render in Obsidian, Desktop, Web apps
Use DOMPurify to escape any unsafe HTML in chat message before adding
it to DOM via innerHTML updates to a HTML element
2024-06-04 10:53:30 +05:30
Debanjum Singh Solanky
9f80c2ab76 Enforce Content-Security-Policy (CSP) in Obsidian, Desktop, Web apps
Prevent XSS attacks by enforcing Content-Security-Policy (CSP) in apps.
Do not allow loading images, other assets from untrusted domains.

- Only allow loading assets from trusted domains
  like 'self', khoj.dev, ipapi for geolocation, google (fonts, img)
  - images from khoj domain, google (for profile pic)
  - assets from khoj domain
  - Do not allow iframe src
  - Allow unsafe-inline script and styles for now as markdown-it escapes html
    in user, khoj chat

- Add hostURL to CSP of the Desktop, Obsidian apps
  Given web client is served by khoj server, it doesn't need to
  explicitly allow for khoj.dev domain. So if user self-hosting, it'll
  automatically allow the domain in the CSP (via 'self')

  Whereas the Obsidian, Desktop clients allow configure the server URL.
  Note *switching server URL breaks CSP until app is reloaded*
2024-06-04 10:53:30 +05:30
Debanjum Singh Solanky
179c70dba8 Upgrade Khoj llama-cpp, django and jinja dependencies 2024-06-04 09:05:53 +05:30
Debanjum Singh Solanky
bbcdb8413d Add null checks, fix build errors in Khoj plugin on newer Obsidian 2024-06-03 18:03:11 +05:30
Debanjum Singh Solanky
d8ace4d34c Highlight the agents, automation tab when active on the web app 2024-06-03 16:57:03 +05:30
sabaimran
4679f07336 Clean up some of the design of agents, inspired by dicussion #792 2024-06-03 12:52:07 +05:30
Debanjum Singh Solanky
8cdab5f31a Update slash command UX in chat UI of desktop app to match web app
Make commands in popup menu on typing slash in chat input selectable
2024-06-02 17:27:37 +05:30
Debanjum Singh Solanky
7828bd6f2e Hide command popup & focus on chatInput on selecting command in web app
Style command popup cursor and add highlight to indicate using slash
command
2024-06-02 17:27:37 +05:30
Debanjum
cf8c9c2a3d Serve image assets from Khoj domain, not directly from S3 bucket (#734)
- Serve generated images from Khoj domain instead of directly from AWS S3
- Rename assets URL from Khoj S3 bucket to assets.khoj.dev
2024-06-02 17:24:35 +05:30
sabaimran
5bb3689562 Do not stream responses in the scheduled_chat response 2024-06-02 11:31:15 +05:30
sabaimran
5132b01ab1 Remove intent_type from telemetry update in api_chat 2024-06-02 10:21:38 +05:30
Raghav Tirumale
a3934b3aaa Improved Command Menu and Help Command (#774)
* The command menu (triggered by "/") now has a clickable list of possible commands, that automatically fill into the chat when pressed.
* The `/help` command now searches `khoj.dev` pages to provide useful assistance to the user.

---------

Co-authored-by: raghavt3 <raghavt3@illinois.edu>
Co-authored-by: sabaimran <65192171+sabaimran@users.noreply.github.com>
2024-06-01 22:33:31 +05:30
sabaimran
6d10f98498 Add additional lines for KHOJ_NO_HTTPS and KHOJ_DOMAIN in the docker-compose 2024-06-01 21:48:43 +05:30
sabaimran
841cbff249 Add documentation for setting up google auth in self-hosted khoj. Closes #771 2024-06-01 21:38:21 +05:30
sabaimran
89178bcebd Fix formatting issues for task email in mobile 2024-06-01 14:19:12 +05:30
Debanjum
b499b3fe2a Upgrade Khoj Obsidian: Chat from Side Pane, Stream Intermediate Steps, Copy Message to Clipboard (#736)
### Details
- **Chat with Khoj from right pane on Obsidian**
  - Modal was too ephemeral, couldn't have it open for reference, quick jump to Khoj chat
- **Stream intermediate steps taken by Khoj** for generating response to the chat pane
  Gives more transparency into Khoj 'thinking' process, e.g internet, notes searches performed, documents read etc. 
  The feedback allows us to tune our messages to elicit better responses by Khoj
- Add ability to **copy message to clipboard, paste chat messages directly into current file**
- Jump to **Search**, **Find Similar** functions from navigation bar on the Khoj Obsidian side pane
- Improve spacing, use consistent colors in chat message references and buttons

Resolves #789, #754
2024-06-01 13:29:21 +05:30
sabaimran
8b9c26c468 Remove unused method 2024-06-01 12:54:43 +05:30
sabaimran
5ec641837a Allow automations to be shareable (#790)
* Updating the API / UI to support sharing of automations
* Allow people to see the automations even when not logged in, and add an overlay effect
* Handle unauthenticated users taking actions
* Support showing pre-filled automation details on the config automations page
* Redirect user to login if they try to add an automation while unauthenticated
2024-06-01 12:44:49 +05:30
Debanjum Singh Solanky
7d7d4cf5c3 Make new chat message text selectable in Obsidian side pane
Resolves #789
2024-06-01 11:01:39 +05:30
Debanjum Singh Solanky
7fb7f200b3 Fix rendering text in chat messages with bulleted lists
Improves #789
2024-06-01 10:51:22 +05:30
Debanjum Singh Solanky
7a93599fe8 Merge branch 'master' into upgrade-khoj-on-obsidian
- Conflicts:
  - src/khoj/interface/web/chat.html
    Use our changes with feedback button changes from master
2024-06-01 10:07:43 +05:30
Debanjum Singh Solanky
92bab9fa61 Get Conversation session action buttons out from under the three dot menu 2024-05-31 20:11:00 +05:30
Debanjum Singh Solanky
7fa42daf89 Render action buttons for new Khoj chat responses in Obsidian
- Dedupe the code to add action buttons to chat messages
- Update the renderIncrementalMessage function to also add the action
  buttons to newly generated chat messages by Khoj
2024-05-31 20:11:00 +05:30
Debanjum Singh Solanky
2d010db83f Toggle chat session view on clicking the Obsidian chat sessions button 2024-05-31 20:11:00 +05:30
Debanjum Singh Solanky
275d4877a6 Fix loading spinner visibility by using contrasting background color
Fix code formating of Khoj chat view in Obsidian
2024-05-31 20:09:24 +05:30
sabaimran
2667ef4544 Refresh the conversation from the db in the websocket flow 2024-05-31 16:15:56 +05:30
sabaimran
fd07abbfc8 Decrease the life of one connection 2024-05-31 15:39:15 +05:30
Debanjum
3090b84252 Disable Minutely Recurrence for Automations (#781)
* Disable automation recurrence at minute level frequency

* Set a max lifetime for django's connections to the db

* Disable any automation that has a non-numeric first digit (i.e., recuring on the minute level)

* Re-enable automations

---------

Co-authored-by: sabaimran <narmiabas@gmail.com>
2024-05-31 12:50:19 +05:30
sabaimran
5dca48d9fc Fix setting of conn_max_age variable 2024-05-31 11:07:13 +05:30
sabaimran
76f941f4e5 Revert email from from to sender again in resend API. keeps switching? 2024-05-31 10:30:18 +05:30
sabaimran
b27f59b12b Remove all unused code related to websockets 2024-05-30 11:39:04 +05:30
sabaimran
4b3d3fe7ea /s/sender/from in resend calls 2024-05-30 08:43:46 +05:30
sabaimran
2076543e32 Disable AP Scheduler while performing maintenance 2024-05-30 08:02:59 +05:30
sabaimran
4aac84e1c1 Pin rsesend verison in pyproject.toml 2024-05-30 07:05:11 +05:30
Debanjum Singh Solanky
7823ef09dc Simplify conditional code. Improve logs to track conversion progress 2024-05-29 17:50:07 +05:30
Debanjum Singh Solanky
215db8cab3 Reduce log level of noisy process lock logs 2024-05-29 13:14:44 +05:30
Debanjum Singh Solanky
7b18919564 Tag external links to open in a separate window on the Desktop app
Previously clicking inline links would open the URL directly in the
Desktop app. This was strange and it didn't provide any way to go back
to Khoj desktop app UI from the opened link
2024-05-29 10:12:50 +05:30
Debanjum Singh Solanky
c957a6cb43 Delete unused base_processor_integration html file from web interface 2024-05-29 08:30:13 +05:30
sabaimran
7dd72c1d25 Fix trailing whitespace issue in development.mdx 2024-05-29 04:36:46 +05:30
sabaimran
cb33fb67fe Remove the automations-related dead code in the web config 2024-05-29 04:22:45 +05:30
Debanjum Singh Solanky
15c5873c20 Provide more context in docs for self-hosting Khoj on Windows 2024-05-28 20:56:26 +05:30
Debanjum Singh Solanky
7594401461 Fix expand chat reference animation in web, desktop, obsidian clients 2024-05-28 20:56:26 +05:30
Debanjum Singh Solanky
1ea7675fc9 View, switch chat sessions from Obsidian chat pane 2024-05-28 20:33:39 +05:30
Debanjum Singh Solanky
e86899eec4 Click on referenced notes by Khoj chat to open it in Obsidian vault
Allow opening Khoj chat references in Obsidian vault if the reference
is a heading or file in the current Obsidian vault
2024-05-28 10:16:40 +05:30
Debanjum
39faae68c0 Merge pull request #768 from MythicalCow/documentation/windows-development-fixes
Documentation Fixes for Development Page
2024-05-28 00:26:15 +05:30
Raghav Tirumale
4a8920f9a4 formatting fix 2024-05-27 12:52:08 -05:00
Raghav Tirumale
9a11a3cd63 Added installation notes for windows users and added postgres setup instructions. 2024-05-27 12:49:52 -05:00
Desmond
70fea6c6b6 fix: delete file request 2024-05-27 14:46:26 +08:00
sabaimran
607534021b Add a link to github in the settings menu, improve styling 2024-05-27 11:39:30 +05:30
Desmond
3f49b5a4ab fix: emacs tests 2024-05-27 10:42:09 +08:00
sabaimran
b97ca9d19d Skip using max_tokens as input to the extract questions step, as that's not used for max_output 2024-05-27 01:23:54 +05:30
sabaimran
9ebf3a4d80 Improve the admin experience, add more metadata to the list_display
- Don't propagate max_tokens to the openai chat completion method. the max for the newer models is fixed at 4096 max output. The token limit is just used for input
2024-05-27 00:49:20 +05:30
sabaimran
01cdc54ad0 Add support for Anthropic models (#760)
* Add support for chatting with Anthropic's suite of models

- Had to use a custom class because there was enough nuance with how the anthropic SDK works that it would be better to simply separate out the logic. The extract questions flow needed modification of the system prompt in order to work as intended with the haiku model
2024-05-26 22:50:34 +05:30
Debanjum Singh Solanky
0f796a79ec Extract function to get link to entry in Obsidian vault for reuse 2024-05-26 18:03:15 +05:30
Debanjum Singh Solanky
e24ca9ec28 Pass file path of each doc reference in references returned by API
- Pass file path of reference along with the compiled reference in
  list of references returned by chat API converts
- Update the structure of references from list of strings to list of
  dictionary (containing 'compiled' and 'file' keys)
- Pull out the compiled reference from the new references data struct
  wherever it was is being used
2024-05-26 18:02:11 +05:30
Debanjum Singh Solanky
ba330712f8 Fix to always pass online results in chat API response 2024-05-26 13:56:55 +05:30
Debanjum Singh Solanky
38d8d2bb56 Show online references used to generate response in Obsidian chat view 2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky
f495d338eb Modularize render message with references func in web based clients
Simplify, reuse, standardize code to render messages with references
in the obsidian, web and desktop clients. Specifically:

- Reuse function to create reference section, dedupe code
- Create reusable function to generate image markdown
- Simplify logic to render message with references
2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky
14a2006c76 Stream steps taken to generate response in Obsidian chat pane
- Setup websocket using Khoj web app as reference.
- Moved the geolocating code to chat view out from the general pane
  view
- Use loading spinner from web instead of the thinking emoji
2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky
afcd22d30c Improve spacing, colors of chat message references and buttons
Works better with dark modes. References have more spacing and adhere
to background color of the chat message itself
2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky
bd4931e70b Add ability to paste chat messages directly into current file
It'll replace any highlighted text with the chat message or if not
text is highlighted, it'll insert the chat message at the last cursor
position in the active file
2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky
032ad3b521 Add ability to copy messages to clipboard from Obsidian Khoj chat 2024-05-26 13:55:22 +05:30
Debanjum Singh Solanky
57f1c53214 Create Nav bar for Obsidian pane. Use abstract View class for reuse
- Jump to chat, show similar actions from nav menu of Khoj side pane
  - Add chat, search icons from web, desktop app
  - Use lucide icon for find similar (for now)
  - Match proportions of find similar icon to khoj other icons via css, js

- Use KhojPaneView abstract class to allow reuse of common functionality like
  - Creating the nav bar header in side pane views
  - Loading geo-location data for chat context
  This should make creating new views easier
2024-05-26 13:55:22 +05:30
sabaimran
e2922968d6 Move some gifs to the assets s3 bucket and add instructions for Ollama, shareable conversations 2024-05-25 01:08:20 +05:30
sabaimran
e23c803cee Release Khoj version 1.12.1 2024-05-24 21:42:03 +05:30
sabaimran
0308699849 Use links from assets.khoj.dev to render images in the automations page 2024-05-24 20:18:02 +05:30
sabaimran
3f9c20a399 Make it easier to manage server-level chat settings (#729)
* Add support for server-wide model settings fix web page reading results returning logic
2024-05-24 20:15:18 +05:30
sabaimran
cbbbe2da9a Add a schedule picker and automations preview func (#747)
* Update suggested automations
* add a schedule picker when creating an automation
* Create a new conversation in flow of the automation scheduling in order to send a preview and deliver more consistent results
* Start adding in scaffolding to manually trigger a test job for an automation
* Add support for manually triggering automations for testing
* Schedule automation asynchronously
* Update styling of the preview button
* Improve admin lookup experience and prevent jobs from being scheduled to run every minute of everyday
* Ignore mypy issues on job info short description
2024-05-24 19:42:47 +05:30
Ikko Eltociear Ashimine
ac3e5089a2 docs: update typo in desktop.md (#744)
reponses -> responses
2024-05-24 03:52:03 +05:30
Md. Shahnewaz Siddique
3af06a3d5a Updated installation instructions for windows, linux in readme (#741) 2024-05-24 03:51:25 +05:30
sabaimran
4511c6ae7c Fix bug in chat feedback flow - user message not included during live chat 2024-05-21 14:55:39 -05:00
Desmond
a3c6045328 Merge remote-tracking branch 'origin/master' 2024-05-21 21:55:53 +08:00
Desmond
b0630c1a98 Simplify partition 2024-05-21 21:52:01 +08:00
sabaimran
0b7910d4af Pin th elangchain-community version explicitly 2024-05-21 05:26:17 -05:00
Raghav Tirumale
d57772f9e7 Add Feedback Buttons on Chat (#721)
### Description and Rationale for Changes
This feature includes thumbs up and thumbs down buttons on Khoj's chat responses that provide automated feedback. When a thumbs up/down button is clicked, the code sends an email to team@khoj.dev with the following:
* user query
* khoj's response
* whether the sentiment of the user was good or bad. 

This is critical in improving Khoj's nondeterministic LLM model for a better user experience.
### List of Changes
* new endpoint in `api_chat.py` (/feedback) that can be used to trigger mail sending).
* thumbs up and thumbs down buttons implemented in `chat.html`
* new function in `routers/email.py` to handle feedback email sending via resend
* `feedback.html` template for a formatted email with the feedback.

---------

Co-authored-by: mythicalcow <mythicalcow@linux.myguest.virtualbox.org>
Co-authored-by: sabaimran <narmiabas@gmail.com>
2024-05-20 16:29:08 -05:00
Debanjum
f941948d11 Merge pull request #738 from joshavant/patch-1
Improve telemetry.md disabling instructions in docs
2024-05-17 21:43:42 +05:30
Josh Avant
37ad1d5397 Update telemetry.md disabling instructions 2024-05-15 15:18:00 -05:00
sabaimran
7feaf34702 Fix capitalization, update suggeted prompt 2024-05-10 02:36:13 -07:00
sabaimran
b545aceb47 Use a simpler example for the sample automation and put schedule on top of instructions 2024-05-09 13:53:19 -07:00
sabaimran
2b8e5a86cc Update version for resent library in pyproject.toml 2024-05-09 13:43:27 -07:00
sabaimran
7ae00832bd Rname from parameter to sender in resend call 2024-05-09 13:29:39 -07:00
sabaimran
fbd76f8ebe Improve the UX of automations (#737)
* Improve the automations UX

- Add suggested jobs to elimiinate some of the cold start problem
- Make each of the tasks cards that are clickable/editable

* Hide suggested automations that have already been added

* Add a footer and reapply styling when a save action is taken on a card
2024-05-09 01:29:48 -07:00
sabaimran
70d0ee4310 Only remove the process lock from a process that created it 2024-05-08 10:14:52 -07:00
Desmond Deng
20303feb3a Merge branch 'khoj-ai:master' into master 2024-05-08 13:46:34 +08:00
Desmond
150cd18bf3 Update batch-size 2024-05-08 13:44:22 +08:00
Desmond
192cd53003 Batch send of index files 2024-05-08 13:38:40 +08:00
sabaimran
a50deb2762 Add better handling for empty responses 2024-05-07 11:49:33 -07:00
sabaimran
4aed6bd274 Add an admin view for subscriptions 2024-05-07 11:48:52 -07:00
sabaimran
77626d28d1 Include stack trace when automation is not successfully craeted 2024-05-07 06:52:41 -07:00
sabaimran
0c8c565ab0 Don't include the whole stack trace for an integrity error 2024-05-07 06:48:18 -07:00
Debanjum Singh Solanky
0a1a6cd041 Get detailed user info in Obsidian from the new v1/user API
Previously we were just getting user email from the /health API
Instead store the retrieved user info in the user settings
2024-05-07 04:37:26 +08:00
Debanjum Singh Solanky
f8f9d066db Focus on input field, scroll to latest message on opening chat pane
Previously scroll and chat input focus weren't applied as view hadn't
been rendered yet
2024-05-07 04:37:26 +08:00
Debanjum Singh Solanky
9f65e8de98 Open Khoj Chat as a Pane instead of a Modal
- Allows having it open on the side as you traverse your Obsidian notes
- Allow faster time to response, having responses visible for context
- Enables ambient interactions
2024-05-07 04:37:26 +08:00
sabaimran
9ae828cf11 Use asssets.khoj.dev for loading math katex rendering 2024-05-07 01:43:46 +08:00
sabaimran
cf0b7628d0 Add the url scheme to the public share url 2024-05-06 21:37:49 +08:00
sabaimran
f6aaecb04f Fix construction method for public share conversation URL 2024-05-06 08:32:51 +05:30
sabaimran
14c9bea663 Make conversations optionally shareable (#712)
* Make conversations optionally shareable

- Shared conversations are viewable by anyone, without a login wall
- Can share a conversation from the three dot menu
- Add a new model for Public Conversation
- The rationale for a separate model is that public and private conversations have different assumptions. Separating them reduces some of the code specificity on our server-side code and allows us for easier interpretation and stricter security. Separating the data model makes it harder to accidentally view something that was meant to be private
- Add a new, read-only view for public conversations
2024-05-05 23:16:04 +05:30
Debanjum Singh Solanky
80cbaca935 Serve generated images from Khoj domain instead of directly from S3
Use CNAME to forward requests from the khoj subdomain to the
equivalent S3 bucket
2024-05-04 20:07:10 +05:30
Debanjum Singh Solanky
425496844b Rename assets URL from Khoj S3 bucket to assets.khoj.dev
Server khoj assets from khoj domain
2024-05-04 20:07:10 +05:30
sabaimran
88daa841fd Rename process lock migration and add a reverse migration step 2024-05-04 20:05:00 +05:30
sabaimran
509a8a412c Throw an error if trying to create a process lock that already exists. Names should be unique 2024-05-04 19:03:53 +05:30
sabaimran
7100614de5 Add support for rendering math equations in the web view (#733)
- Add parsing logic for LaTeX-format math equations in the web chat
- Add placeholder delimiters when converting the markdown to HTML in order to avoid removing the escaped characters
- Add the `<!DOCTYPE html>` specification to the page
2024-05-04 15:59:17 +05:30
Debanjum Singh Solanky
d9b3482b1a Show error when required fields to create automation are not set 2024-05-04 11:17:30 +05:30
Debanjum Singh Solanky
91a5643c5c Use Preview label for Automate feature. Prefix mailto: link to contact 2024-05-04 10:59:17 +05:30
Debanjum Singh Solanky
fd2328ab40 Do not hard code base url of path to automation icon in chat message 2024-05-04 10:59:07 +05:30
sabaimran
a38f3227e2 Revert domain in task task send emails 2024-05-03 15:27:27 +05:30
sabaimran
a1263951e9 Use mail to in email contact link 2024-05-03 12:16:56 +05:30
sabaimran
7c9847fe48 Increase jitter to 60 2024-05-03 11:38:22 +05:30
sabaimran
737ebfd521 Make improvements to online search prompts and use a custom domain for automations emails 2024-05-03 10:47:42 +05:30
sabaimran
42e9504ba8 Use a different function for getting last run time, avoid async/sync issues 2024-05-02 12:13:45 +05:30
sabaimran
9e8491b814 Add experimental disclaimers to the automations 2024-05-02 11:40:37 +05:30
sabaimran
c418449311 Add additional robustness in verifying job execution parameters at run time 2024-05-02 11:13:04 +05:30
sabaimran
690e9d8ed3 Collapse the reminders after they're successfully scheduled 2024-05-02 09:55:04 +05:30
sabaimran
6b648ee3ad Add experimental disclaimer in the automation page 2024-05-02 09:21:27 +05:30
sabaimran
f4fbc91515 Remove the exclamation point from the email 2024-05-01 19:01:51 +05:30
sabaimran
bddd1d0fcb Quip, smart reminders 2024-05-01 16:39:07 +05:30
sabaimran
bc8b92a77d Release Khoj version 1.12.0 2024-05-01 16:30:48 +05:30
sabaimran
9d02c354dd Merge pull request #732 from khoj-ai/fit-and-finish/schedule-tasks
Fixes and improves for scheduled tasks
2024-05-01 03:16:09 -07:00
sabaimran
b499851097 Use the cleaned query as the reference query in the email notification 2024-05-01 15:33:11 +05:30
sabaimran
f24495e0e6 Fix time zone used in query history. Closes #694 2024-05-01 15:31:48 +05:30
sabaimran
7fd57d737e Adjustments to improve overall styling of config page, email template 2024-05-01 14:19:47 +05:30
sabaimran
28578310d1 Add log line when sending a task-related email 2024-05-01 13:56:02 +05:30
sabaimran
a86f95117e Add the subject generation prompt and helper method 2024-05-01 13:55:32 +05:30
sabaimran
c30ba2e551 Set subject dynamically when creating new tasks, and make some minor improvments to the automations UI 2024-05-01 13:54:59 +05:30
sabaimran
d1b2037676 Shutdown the scheduler when the application is exiting 2024-05-01 13:53:34 +05:30
Debanjum
10f623154e Enable Creating Automations from Khoj (#731)
## Support Scheduling Automations (#695) 
   1. Detect when user intends to schedule a task, aka reminder
      - Support new `reminder` output mode to the response type chat actor
      - Show examples of selecting the reminder output mode to the response type chat actor
   2. Extract schedule time (as cron timestring) and inferred query to run from user message
   3. Use APScheduler to call chat with inferred query at scheduled time

## Make Automations Persistent (#714) 
  - Make scheduled jobs persistent and work in multiple worker setups
  - Add new operation Scheduled Job to Operation enum of ProcessLock

## Add UX to Configure Scheduled Tasks (#715)
  - Add section in settings page to view, delete your scheduled tasks
  - Add API endpoints to get and delete user scheduled tasks

## Make Automations more Robust. Improve UX (#718)
  - Decouple Task Run from User Notification
  - Make Scheduling more Robust
    - Use JSON mode to get parse-able output from chat model
    - Make timezone calculation programmatic on server instead of asking chat model
    - Use django-apscheduler to handle apscheduler and django interfacing
  - Improve automation UX. Move it out into separate top level page
    - Allow creating, modifying automations from the automations page
    - Infer cron from natural language client side to avoid roundtrip
2024-05-01 11:08:19 +05:30
Debanjum Singh Solanky
89a8dbb81a Fix edit job API. Use user timezone, pass all reqd. params to automation
- Pass user and calling_url to the scheduled chat too when modifying
  params of automation
- Update to use user timezone even when update job via API
- Move timezone string to timezone object calculation into the
  schedule automation method
2024-05-01 10:29:49 +05:30
Debanjum Singh Solanky
19c5af3ebc Handle natural language to cron translation error on web client 2024-05-01 09:10:18 +05:30
Debanjum Singh Solanky
70ee9ddf91 Merge migrations from main with feature branch 2024-05-01 09:10:18 +05:30
Debanjum Singh Solanky
8f28f6cc1e Remove now unused location data from being passed to automation funcs 2024-05-01 08:48:16 +05:30
Debanjum Singh Solanky
815966cb25 Unify, modularize DB adapters to get automation metadata by user further 2024-05-01 08:47:50 +05:30
Debanjum Singh Solanky
21bdf45d6f Add link to Automate page in nav pane of the web app 2024-05-01 08:47:50 +05:30
Debanjum Singh Solanky
bd5008136a Move automations into independent page. Allow direct automation
- Previously it was a section in the settings page. Move it to
  independent, top-level page to improve visibility of feature

- Calculate crontime from natural language on web client before
  sending it to to server for saving new/updated schedule to disk.
  - Avoids round-trip of call to chat model

- Convert POST /api/automation API endpoint into a direct request for
  automation with query_to_run, subject and schedule provided via the
  automation page. This allows more granular control to create automation
  - Make the POST automations endpoint more robust; runs validation
    checks, normalizes parameters
2024-05-01 08:47:48 +05:30
Debanjum Singh Solanky
cbc8a02179 Make, use func for constructing the automation created response
- Dedupe logic across http, ws chat API endpoints
- Reduces size of already too long http, ws chat API endpoint funcs
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
c52ed333fa Make content, cards on config pages occupy the whole middle column
- Make the config page content use  the same top level 3-column layout
  as the khoj-header-wrapper
  This ensures the content is aligned with heading pane width
- Let cards and other settings sections scale to the width of their
  grid element. This utilizes more of the screen space and does it
  consistently across the different settings pages
2024-05-01 08:30:10 +05:30
sabaimran
ad4145e48c Fix unique has for job id 2024-05-01 08:30:10 +05:30
sabaimran
311d58e1ed Ensure the automated_task command is removed from the prepended query 2024-05-01 08:30:10 +05:30
sabaimran
eb65532386 Use Django ap scheduler in place of the sqlalchemy one 2024-05-01 08:30:10 +05:30
sabaimran
06213ea814 Fix token retrieval when executing the job and name async job approriately 2024-05-01 08:30:10 +05:30
sabaimran
ca8a7d8368 Revert sync -> aync in send welcome email method 2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
6936875a82 Use DB adapter to unify logic to get, delete automation by auth user
To use place with logic to get, view, delete (and edit soon) automations
by (authenticated) user, instead of scattered across code
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
1238cadd31 Allow editting query-to-run from the automation config section 2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
cb2b1dccc5 Add icon for Automation feature. Replace old icons for delete, new 2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
23f2057868 Allow creating automations from automation settings section in web ui
- Create new POST API endpoint to create automations
- Use it in the settings page on the web interface to create
  new automations

This simplified managing automations from the setting page by allowing
both delete and create from the same page
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
2f9241b5a3 Rename scheduled task to automations across code and UX
- Fix query, subject parameters passed to email template
- Show 12 hour scheduled time in automation created chat message
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
230d160602 Improve rendering task scheduled settings view and message
- Render crontime string in natural language in message & settings UI
- Show more fields in tasks web config UI
- Add link to the tasks settings page in task scheduled chat response
- Improve task variables names
  Rename executing_query to query_to_run. scheduling_query to
  scheduling_request
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
d341b1efe8 Store, retrieve task metadata from the job name field 2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
ae10ff4a5f Create create_scheduled_task func to dedupe logic across ws, http APIs
Previously, both the websocket and http endpoint were implementing
the same logic. This was becoming too unwieldy
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
8dfa0bf047 Simplify task scheduler prompt. No timezone conversion. Infer subject
- Make timezone aware scheduling programmatic, instead of asking the
  chat model to do the conversion. This removes the need for
  scratchpad and may let smaller models handle the task as well
- Make chat model infer subject for email. This should make the
  notification email more readable
- Improve email by using subject in email subject, task heading. Move
  query to email final paragraph, which is where task metadata  should
  go
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
2c563ad280 Use hash of query in process lock id to standardize id format
- Using inferred_query directly was brittle (like previous job id)
- And process lock id had a limited size, so wouldn't work for larger
  inferred query strings
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
3ce06a938c Render scheduled task response as html to improve readability in email 2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
c17dbbeb92 Render next run time in user timezone in config, chat UIs
- Pass timezone string from ipapi to khoj via clients
  - Pass this data from web, desktop and obsidian clients to server
- Use user tz to render next run time of scheduled task in user tz
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
6736551ba3 Improve scheduled task text rendered in UI 2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
0e01362469 Merge DB migrations from master with those from scheduled task feature 2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
a5ed4f2af2 Send email to share results of scheduled task 2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
69775b6d6e Add /task command. Use it to disable scheduling tasks from tasks
This takes the load of the task scheduling chat actor / prompt from
having to artifically differentiate query to create scheduled task
from a scheduled task run.
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
22289a0002 Improve task scheduling by using json mode and agent scratchpad
- The task scheduling actor was having trouble calculating the
  timezone. Giving the actor a scratchpad to improve correctness by
  thinking step by step
- Add more examples to reduce chances of the inferred query looping to
  create another reminder instead of running the query and sharing
  results with user
- Improve task scheduling chat actor test with more tests and
  by ensuring unexpected words not present in response
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
7f5981594c Only notify when scheduled task results satisfy user's requirements
There's a difference between running a scheduled task and notifying
the user about the results of running the scheduled task.

Decide to notify the user only when the results of running the
scheduled task satisfy the user's requirements.

Use sync version of send_message_to_model_wrapper for scheduled tasks
2024-05-01 08:30:10 +05:30
Debanjum Singh Solanky
7e084ef1e0 Improve job id. Fix refreshing list of jobs on delete from config page 2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
a1e5195c8b Save separate user message time from Khoj response time in chat logs
Previously user message time was being stored the same as Khoj
response time in conversation logs.
2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
5133b6e73b Minor improvements to styling the config page 2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
648f1a5c71 Suffix chat response element vars with "El" in chat.html of web, desktop apps 2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
98d0ffecf1 Add section in settings page to view, delete your scheduled tasks 2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
423d61796d Add API endpoints to get and delete user scheduled tasks 2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
af0972c539 Make scheduled jobs persistent and work in multiple worker setups
- Store scheduled job state in Postgres so job schedules persist
  across app restarts
- Use Process Locks to only allow single worker to process a given job
  type. This prevents duplicating job runs across all workers
2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
fcf878e1f3 Add new operation Scheduled Job to Operation enum of ProcessLock 2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
c28d7d3414 Add basic chat actor test to infer scheduled queries 2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
c11742f443 Add chat actor to schedule run query for user at specified times
- Detect when user intends to schedule a task, aka reminder
  Add new output mode: reminder. Add example of selecting the reminder
  output mode
- Extract schedule time (as cron timestring) and inferred query to run
  from user message
- Use APScheduler to call chat with inferred query at scheduled time
- Handle reminder scheduling from both websocket and http chat requests

- Support constructing scheduled task using chat history as context
  Pass chat history to scheduled query generator for improved context
  for scheduled task generation
2024-05-01 08:28:59 +05:30
Debanjum Singh Solanky
9e068fad4f Handle null ref, when refresh conversation from db in websocket chat 2024-04-30 14:19:07 +05:30
sabaimran
37879a7850 Release Khoj version 1.11.2 2024-04-30 13:31:06 +05:30
sabaimran
93b41170d1 Refresh the conversation log from the db before addressing the next query 2024-04-30 13:27:51 +05:30
Debanjum Singh Solanky
f1545d2b2f Add, fix help link, improve title style in web ui config pages
- Align title text with icon better in all config cards
- Fix help link to github setup docs
- Fix help link to notion setup docs
2024-04-30 05:50:08 +05:30
Debanjum Singh Solanky
e6da0f9a8c Fix response type of delete client tokens API endpoint
Previously the make delete API response failed, after deleting token.
Required a page refresh to see that the API token was actually gone.

This was happening because the response type of the delete token API
endpoint isn't a string, so it failed FastAPI response validation
checks.
2024-04-30 02:46:52 +05:30
sabaimran
0f4c3518d3 Allow session cookies to be stored with a lax policy for some localhost scenarios 2024-04-29 15:48:45 +05:30
sabaimran
5beedc9734 Use Secure proxy ssl header only if no https 2024-04-29 15:33:21 +05:30
sabaimran
408f4780ce Add and update documentation for setting up khoj with an openai proxy server or offline llm 2024-04-27 20:16:32 +05:30
sabaimran
12258f02d7 Release Khoj version 1.11.1 2024-04-27 18:42:24 +05:30
sabaimran
2047b0c973 Support customization of the OpenAI base url in admin settings (#725)
- Allow self-hosted users to customize their open ai base url. This allows you to easily use a proxy service and extend support for other models.
- This also includes a migration that associates any existing openai chat model configuration with an openai processor configuration
- Make changing model a paid/subscriber feature
- Removes usage of langchain's OpenAI wrapper for better control over parsing input/output
2024-04-27 18:24:35 +05:30
sabaimran
49834e3b00 Add a hero image for the og:image meta tag 2024-04-27 17:07:21 +05:30
sabaimran
138f12f957 Fix indentation and revert first run message link styling to all links 2024-04-27 09:56:58 +05:30
Debanjum Singh Solanky
4395ed8065 Improve extract_questions func. Set message role to user, not assistant
Previous behavior of passing message with role = "assistant was
reducing instruction following quality of the model
2024-04-26 11:55:22 +05:30
Debanjum Singh Solanky
346499f12c Fix, improve args being passed to chat_completion args
- Allow passing completion args through completion_with_backoff
- Pass model_kwargs in a separate arg to simplify this
- Pass model in `model_name' kwarg from the send_message_to_model func
  `model_name' kwarg is used by langchain, not `model' kwarg
2024-04-26 11:55:22 +05:30
sabaimran
d8f2eac6e0 Release Khoj version 1.11.0 2024-04-25 17:24:59 +05:30
Debanjum Singh Solanky
1842017393 Skip trying to index deleted files, folders from Desktop app
Previously app would crash on startup if desktop app was told to
index a file that had been deleted afterwards
2024-04-25 15:23:05 +05:30
Debanjum
17a06f152c Support Llama 3 and Improve Offline Chat Actors (#724)
- Add support for Llama 3 in Khoj offline mode
- Make chat actors generate valid json with more local models
- Fix offline chat actor tests
2024-04-25 14:00:56 +05:30
Debanjum
220e5516ab Make Search Models More Configurable. Upgrade Default Cross-Encoder (#722)
- Upgrade default cross-encoder to mixedbread ai's mxbai-rerank-xsmall
- Support more embedding models by making query, docs encoding configurable
2024-04-25 13:55:49 +05:30
Debanjum Singh Solanky
cf08eaf786 Add comments explaining each field in the search model config in DB 2024-04-25 13:54:13 +05:30
Debanjum
4ee5ac7c20 Fix Chat UI and Indexing on Desktop App (#723)
- Make valid file extension checking case insensitive on Desktop app
- Skip indexing non-existent folders on Desktop app
- Pass auth headers to fix lazy load of chat messages on Desktop app
- Set chat-message height to height of content in web, desktop
2024-04-24 18:49:03 +05:30
Debanjum Singh Solanky
89ef23de50 Upgrade gunicorn and make it only a production dependency 2024-04-24 11:28:55 +05:30
Debanjum Singh Solanky
799efb5974 Create DB migration to add new fields and change default cross-encoder 2024-04-24 09:50:34 +05:30
Debanjum Singh Solanky
ec41482324 Upgrade default cross-encoder to mixedbread ai's mxbai-rerank-xsmall
Previous cross-encoder model was a few years old, newer models should
have improved in quality. Model size increases by 50% compared to
previous for better performance, at least on benchmarks
2024-04-24 09:50:09 +05:30
Debanjum Singh Solanky
7eaf9367fe Support more embedding models by making query, docs encoding configurable
Most newer, better embeddings models add a query, docs prefix when
encoding. Previously Khoj admins couldn't configure these, so it
wasn't possible to use these newer models.

This change allows configuring the kwargs passed to the query, docs
encoders by updating the search config in the database.
2024-04-24 09:49:17 +05:30
Debanjum Singh Solanky
f2db8d7d99 Fix offline chat actor tests
Do not check for original q in extracted questions. Since this was
removed in a previous commit
2024-04-24 09:40:00 +05:30
Debanjum Singh Solanky
4f7237b158 Make chat actors generate valid json with more local models
Improve tool, online search, webpage links, docs search chat actor
prompts. Ensure works with hermes-2-pro and llama-3.

Be more specific about generating JSON and not saying anything else.
2024-04-24 09:40:00 +05:30
Debanjum Singh Solanky
a2e4e4bede Add support for Llama 3 in Khoj offline mode
- Improve extract question prompts to explicitly request JSON list
- Use llama-3 chat format if HF repo_id mentions llama-3. The
  llama-cpp-python logic for detecting when to use llama-3 chat format
  isn't robust enough currently
2024-04-24 09:40:00 +05:30
Debanjum Singh Solanky
8e77b3dc82 Fix infer_max_tokens func when configured_max_tokens is set to None 2024-04-24 09:36:29 +05:30
Debanjum Singh Solanky
8196ab62f9 Make valid file extension checking case insensitive on Desktop app 2024-04-24 09:35:20 +05:30
Debanjum Singh Solanky
5def14e3bb Skip indexing non-existent folders on Desktop app 2024-04-24 09:35:20 +05:30
Debanjum Singh Solanky
cd05f262a6 Pass auth headers to fix lazy load of chat messages on Desktop app 2024-04-24 09:35:20 +05:30
Debanjum Singh Solanky
4d5d3e6433 Set chat-message height to height of content in web, desktop
In some cases, especially with image generation requests, this was
causing the chat messages to overlap in the chat UI
2024-04-24 09:35:20 +05:30
sabaimran
60658a8037 Get rid of enable flag for the offline chat processor config
- Default, assume that offline chat is enabled if there is an offline chat model option configured
2024-04-23 23:08:29 +05:30
sabaimran
ac474fce38 Ensure that the tokenizer and max prompt size are used the wrapper method 2024-04-23 21:22:23 +05:30
Olatoyan George
ad59180fb8 Added indication in the desktop UI for back-end connectivity (#711)
* Changed the styling of the link that takes a user to the settings page into a button
* added an indicator that shows if a user is connected to the server or not
* made a class name more descriptive and also made the text in first run message more intuitive
* changed the command to install dependencies in the README.md
* changed the class name of the first run message text to be more descriptive
* added icons in the desktop UI that shows if a file is synced successfully or not
* made the link class name in the homepage more descriptive
* fixed the hover issue on status box in the chat header pane
* fixed hovering issue on status box on macOS
2024-04-23 16:43:48 +05:30
Debanjum
419b044ac5 Use set, inferred max token limits wherever chat models are used (#713)
- User configured max tokens limits weren't being passed to
  `send_message_to_model_wrapper'
- One of the load offline model code paths wasn't reachable. Remove it
  to simplify code
- When max prompt size isn't set infer max tokens based on free VRAM
  on machine
- Use min of app configured max tokens, vram based max tokens and
  model context window
2024-04-23 16:42:35 +05:30
AjaySDwivedi1
abf6f963ea Replaced reinitialize and save all button to a sync button in config.… (#701)
Replaced reinitialize and save all button to a sync button in config
2024-04-23 16:42:11 +05:30
Debanjum Singh Solanky
c39c4e4ec4 Improve prompt for online search query generation chat actor
- Allow searching github, pypi for information about Khoj
- Enable creating multiple search queries by rewording prompt
2024-04-22 01:32:11 +05:30
Debanjum Singh Solanky
175169c156 Use set, inferred max token limits wherever chat models are used
- User configured max tokens limits weren't being passed to
  `send_message_to_model_wrapper'
- One of the load offline model code paths wasn't reachable. Remove it
  to simplify code
- When max prompt size isn't set infer max tokens based on free VRAM
  on machine
- Use min of app configured max tokens, vram based max tokens and
  model context window
2024-04-20 11:23:28 +05:30
Debanjum Singh Solanky
002cd14a65 Only let agent use online search tool if connected to it 2024-04-20 11:19:48 +05:30
Debanjum Singh Solanky
75c9ebbc54 Only show uvicorn debug logs at higher verbosity levels
Don't automatically show the uvicorn logs when in_debug_mode, only
show on at least verbosity = 2, i.e when start khoj with -vv flag
2024-04-20 11:18:01 +05:30
sabaimran
c6d668bacf Bump gunicorn workers per server up to 2 2024-04-18 11:32:51 +05:30
sabaimran
c9a8abafa4 Merge pull request #710 from khoj-ai/add-run-with-process-lock-and-fix-edge-cases
Extract run with process lock logic into func. Use it to re-index content
2024-04-17 01:29:02 -07:00
sabaimran
6de4a4873a Fix image-related client unit test 2024-04-17 13:28:48 +05:30
sabaimran
3132430737 Add tests for the db lock 2024-04-17 13:22:41 +05:30
sabaimran
d11354f9c8 Remove additional references to image content config 2024-04-17 13:00:50 +05:30
sabaimran
105dbf49e4 Fix max_duration_in_seconds for the update_embeddings job 2024-04-17 13:00:18 +05:30
Debanjum Singh Solanky
8e0bae894d Extract run with process lock logic into func. Use for content reindexing 2024-04-17 12:31:19 +05:30
Debanjum Singh Solanky
e9f608174b Fix access to Khoj admin panel from non HTTPS custom domains
To access the Khoj admin panel from a non HTTPS custom domain the
`KHOJ_NO_SSL' and `KHOJ_DOMAIN' env vars need to be explictly set.

See the updated setup docs for details.

Resolves #662
2024-04-17 03:20:05 +05:30
sabaimran
46210695b6 pin version of huggingface hub explicitly to ensure relevant constants are present. Closes #708 2024-04-17 01:09:36 +05:30
sabaimran
b0059654c9 Do not create an import error if the resend module is not available 2024-04-17 01:00:22 +05:30
sabaimran
f04ead7c37 Remove seting up log line for configuring image search 2024-04-17 00:45:39 +05:30
sabaimran
0208688801 Increase factor for n_ctx reduciton to 2e6 2024-04-17 00:41:36 +05:30
Debanjum Singh Solanky
1f2ffce85b Copy chat message with it's markdown formatting in Web, Desktop apps 2024-04-16 22:10:34 +05:30
sabaimran
91c8b137f1 Add a database lock for jobs that shouldn't be run by multiple workers (#706)
* Add a database lock for jobs that shouldn't be run by multiple workers

* Import relevant functions from utils.helpers
2024-04-16 21:29:27 +05:30
sabaimran
adb2e8cc5f Check if n is populated before making a comparison 2024-04-16 02:05:58 +05:30
Debanjum Singh Solanky
6707ccc463 Check before updating "chat" key in meta_log in chat history API endpoint 2024-04-15 21:06:47 +05:30
Debanjum Singh Solanky
4e7812fe55 Use Django management cmd to update inline images in DB to/from WebP/PNG
This provides Khoj server admins more control on migrating their S3
images to WebP format from PNG
2024-04-15 20:19:49 +05:30
Debanjum Singh Solanky
7fab8d6586 Only use chat messages count in history API endpoint when set by client 2024-04-15 19:12:57 +05:30
Debanjum
6b3ef61dd2 Improve Chat Page Load Perf, Offline Chat Perf and Miscellaneous Fixes (#703)
### Store Generated Images as WebP 
- 78bac4ae Add migration script to convert PNG to WebP references in database
- c6e84436 Update clients to support rendering webp images inline
- d21f22ff Store Khoj generated images as webp instead of png for faster loading

### Lazy Fetch Chat Messages to Improve Time, Data to First Render
This is especially helpful for long conversations with lots of images
- 128829c4 Render latest msgs on chat session load. Fetch, render rest as they near viewport
- 9e558577 Support getting latest N chat messages via chat history API

### Intelligently set Context Window of Offline Chat to Improve Performance
- 4977b551 Use offline chat prompt config to set context window of loaded chat model

### Fixes
- 148923c1 Fix to raise error on hitting rate limit during Github indexing
- b8bc6bee Always remove loading animation on Desktop app if can't login to server
- 38250705 Fix `get_user_photo` to only return photo, not user name from DB

### Miscellaneous Improvements
- 689202e0 Update recommended CMAKE flag to enable using CUDA on linux in Docs
- b820daf3 Makes logs less noisy
2024-04-15 18:34:29 +05:30
Debanjum Singh Solanky
a352940dfd Use Django management command to update images URL in DB to WebP
This provides Khoj server admins more control on migrating their S3
images to WebP format from PNG
2024-04-15 17:53:41 +05:30
Debanjum Singh Solanky
7d8e8eb0cf Use Enum to type text-to-image intent of Khoj chat response 2024-04-15 17:53:40 +05:30
Debanjum Singh Solanky
128829c477 Show latest msgs on chat session load. Fetch rest as they near viewport
- Reduces time to first render when loading long chat sessions
- Limits size of first page load, when loading long chat sessions

These performance improvements are maximally felt for large chat
sessions with lots of images generated by Khoj

Updated web and desktop app to support these changes for now
2024-04-15 16:10:56 +05:30
Debanjum Singh Solanky
9e5585776c Support getting latest N chat messages via chat history API
Get latest N if N > 0, else return all messages except latest N from
the conversation
2024-04-15 15:32:32 +05:30
Debanjum Singh Solanky
e5ff85f6fb Start fetching khoj css before icons to reduce time with no styling
This should reduce frequency of page load jitter when icons are loaded
before style is applied
2024-04-15 15:32:32 +05:30
Debanjum Singh Solanky
d5de59d411 Do not assume results key present in notion content when indexing 2024-04-15 08:02:20 +05:30
Debanjum Singh Solanky
4977b55106 Use offline chat prompt config to set context window of loaded chat model
Previously you couldn't configure the n_ctx of the loaded offline chat
model. This made it hard to use good offline chat model (which these
days also have larger context) on machines with lower VRAM
2024-04-14 02:35:36 +05:30
Debanjum Singh Solanky
689202e00e Update recommended CMAKE flag to enable using CUDA on linux in Docs 2024-04-14 02:35:27 +05:30
Debanjum Singh Solanky
148923c13a Fix to raise error on hitting rate limit during Github indexing 2024-04-13 22:09:13 +05:30
sabaimran
f24d71c71c Improve the agents UX (#702)
- Make the chat buttons look more clickable
- Show agent name in new conversation message
- Add an icon to the CTA to send agent a message
2024-04-13 20:11:37 +05:30
Debanjum Singh Solanky
78bac4ae05 Add migration script to convert PNG to WebP references in database 2024-04-13 19:06:28 +05:30
Debanjum Singh Solanky
c6e8443631 Update clients to support rendering webp images inline
This is for self-hosted scenarios where AWS S3 uploads is not enabled
2024-04-13 13:11:18 +05:30
Debanjum Singh Solanky
d21f22ffa1 Store Khoj generated images as webp instead of png for faster loading 2024-04-13 13:03:32 +05:30
Debanjum Singh Solanky
b820daf38f Makes logs less noisy
- Show telemetry enabled/disabled state on init, not every 2 minutes
- Convert no docs synced logs to debug level instead of warning
  Having synced docs isn't as important to use Khoj now, unlike before
2024-04-13 11:22:58 +05:30
Debanjum Singh Solanky
b8bc6bee83 Always remove loading animation on Desktop app if can't login to server 2024-04-13 11:02:44 +05:30
Debanjum Singh Solanky
382507051f Fix get_user_photo to only return photo, not user name from DB 2024-04-13 11:02:30 +05:30
sabaimran
f06ec485cb Fix redirect url process for login flow, existing user 2024-04-12 17:10:05 +05:30
sabaimran
87b9a93fa1 Update assertion line to match new logic 2024-04-12 13:09:19 +05:30
sabaimran
b86e68a29d Make it easier to view agents in the admin page 2024-04-12 13:02:22 +05:30
sabaimran
e58bd0e485 Remove mbox file from list of files expected to be included 2024-04-12 12:55:22 +05:30
sabaimran
6634d603a8 Add links for contributors to use in the readme 2024-04-12 12:49:12 +05:30
sabaimran
1377a44a1a Suppress debug logs from uvicorn.error to avoid clutter from websockets
- If application is not in DEBUG_MODE
2024-04-12 12:12:16 +05:30
Debanjum Singh Solanky
89b8ec3546 Release Khoj version 1.10.2 2024-04-12 11:53:32 +05:30
Debanjum Singh Solanky
50b4788a91 Remove chat loading animation in login required state on Desktop app 2024-04-12 11:50:54 +05:30
Debanjum Singh Solanky
b3f4794d91 Remove the unnecessary async/await func chains on Desktop app 2024-04-12 11:49:25 +05:30
Debanjum Singh Solanky
1e30a072d4 Just use file ext to identify indexable files to fix Desktop app install
- Magika on Desktop app was too bloated (100Mb to 250Mb) and broke
  install for some reason. Not sure why it was causing the app install
  to fail but do not have time to currently investigate

- Just use file extensions whitelist it's good enough for now. Let
  server handle the deeper identification of file type
2024-04-12 11:16:07 +05:30
Debanjum Singh Solanky
5c7797dbca Only check content type if file extension cannot identify text file 2024-04-12 03:40:42 +05:30
Debanjum Singh Solanky
7d2ef728e6 Fix identifying pdf files on server
Introduced bug in previous commit that would stop indexing PDF files
as trying to check content_group instead of mime_type is application/pdf
2024-04-12 03:07:46 +05:30
Debanjum Singh Solanky
07f8fb5c5b Release Khoj version 1.10.1 2024-04-12 02:18:07 +05:30
Debanjum Singh Solanky
a7d9102c33 Make identifying text, code files with Magika more robust on server
Use identified content group rather than mime_type to find text files.
2024-04-12 02:12:26 +05:30
Debanjum Singh Solanky
60337086f9 Release Khoj version 1.10.0 2024-04-12 01:01:02 +05:30
Debanjum Singh Solanky
34c3f70203 Index only files with valid text extension in folders synced by Desktop app
This maintains consistent set of indexable files from Desktop app,
whether indexing via file or folder filters
2024-04-12 00:59:54 +05:30
Debanjum
9a48f72041 Index more text file types from Desktop, Github (#692)
### Index more text file types 
- Index all text, code files in Github repos. Not just md, org files
- Send more text file types from Desktop app and improve indexing them
- Identify file type by content & allow server to index all text files

### Deprecate Github Indexing Features
- Stop indexing commits, issues and issue comments in a Github repo
- Skip indexing Github repo on hitting Github API rate limit

### Fixes and Improvements
- **Fix indexing files in sub-folders from Desktop app**
- Standardize structure of text to entries to match other entry processors
2024-04-12 00:08:29 +05:30
Debanjum Singh Solanky
0819b83d0b Fix constructing status update strings for intermediate chat steps 2024-04-11 20:31:32 +05:30
Debanjum Singh Solanky
d15b9bc272 Tell doc search actor to not generate online queries for doc search
This can pick up irrelevant details from notes
2024-04-11 19:49:41 +05:30
Debanjum Singh Solanky
15a78b19ad Improve Inferred Document Search Query Extraction from GPT
Using stop_words = "\n" was preventing JSON responses with newlines in
them
2024-04-11 19:24:04 +05:30
Debanjum Singh Solanky
653681967e Show inferred document search queries in intermediate chat step on Web app 2024-04-11 19:24:04 +05:30
Debanjum Singh Solanky
997741119a Show better intermediate steps when responding to chat via web socket
- Show internet search, webpage read, image query, image generation steps
- Standardize, improve rendering of the intermediate steps on the web app

Benefits:
1. Improved transparency, allow users to see what Khoj is doing behind
   the scenes and modify their query patterns to improve response quality
2. Reduced websocket connection keep alive timeouts for long running steps
2024-04-11 18:04:40 +05:30
sabaimran
fae7900f19 Remove more 2024-04-11 00:27:44 +05:30
sabaimran
5d1dd3e2b7 If resend not enabled, don't send the welcome email 2024-04-10 23:52:42 +05:30
sabaimran
d2f9c43c8e Use datetime.timezone.utc instead of datetime.utc 2024-04-10 23:07:43 +05:30
Debanjum Singh Solanky
f2dc9709b7 Use Magika to more robustly identify text files to send for indexing
- `file-type' doesn't handle mis-labelled files or files without
   extensions well

- Only show supported file types in file selector dialog on Desktop app
  Use Magika to get list of text file extensions. Combine with other
  supported extensions to get complete list of supported file extensions.
  Use it to limit selectable files in the File Open dialog.

  Note: Folder selector will index text files with no extensions as well
2024-04-10 22:44:24 +05:30
sabaimran
3fe94a67b0 Send welcome emails when a new user signs up (#691)
* Don't trigger any re-indexing on server initailization

* Integrate Resend to send welcome emails when a new user signs up

- Only send if this is the first time they've signed in
- Configure welcome email with basic styling, as more complex designs don't work and style tag did not work
2024-04-10 19:57:33 +05:30
Debanjum
6d153022f6 Improve nav pane, chat session UI on Desktop, Web app (#693)
### Enable copying chat messages. Improve copy button behavior and styling
  - Add button to copy chat messages on Desktop, Web apps
  - Improve copy button's icon, hover color & click animation in Desktop, Web apps

### Improve Navigation, Chat Session Panes on Desktop, Web apps
  - Dynamically generate navigation menu based on user info from server
  - Create API endpoint to get authenticated user information
  - Collapse navigation tabs into icons on mobile. Add spacing to them
  - Add Chat navigation tab back to top pane on Web app
  - Use proper icons for Search, Chat and Agents tab on navigation pane
  
### Miscellaneous Improvements
  - Make current chat expand to full width when session panel collapsed on Desktop App
  - Add chat session loading spinner to Desktop App (same as Web app)
   
### Fixes 
  - Show title bar in Khoj desktop app on Windows to simplify close, minimize etc.
  - Only render first run setup message once if error or server not running
  - Fix showing Search navigation tab from Agent pages on web client
2024-04-10 19:54:12 +05:30
Debanjum Singh Solanky
48d249db9e Center the nav item text and user profile initial icons 2024-04-10 19:38:43 +05:30
Debanjum Singh Solanky
60f6a1c6f1 Use svg icons in nav pane to standardize styling on Web, Desktop apps
Emojis varied based on device. svg icons standardize icon styles of
the web, desktop apps
2024-04-10 19:38:43 +05:30
Debanjum Singh Solanky
cccea484e4 Pass username, location context in system prompt instead of chat message
The username and location in system prompt should disambiguate user
context from user's actual message for the chat model.

It doesn't need to be told to not mention the context or acknowledge
the context instructions in it's response, as it understands that this
information is just context and not part of the user's actual message.
2024-04-10 15:05:33 +05:30
Debanjum Singh Solanky
804c04f7b9 Do not render copy message button on every Khoj thinking step
Only render copy chat message button once, after message text is rendered
2024-04-10 14:48:36 +05:30
sabaimran
bb15c9605d Add a sitemap plugin 2024-04-10 14:35:04 +05:30
sabaimran
a4afada746 Remove client-side timeouts for the khoj socket 2024-04-10 13:35:25 +05:30
Debanjum Singh Solanky
cadeaac769 Align conversation sessions side panel on Desktop app with Web app
- Move new conversation button to right of "Conversation" title
- Reduce size of chat message loading ellipsis animation
- Add loading animation for chat session
2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky
1c3d129e08 Add button to copy chat messages on Desktop client 2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky
0a5a91619e Improve copy button's icon, hover color & click animation in Desktop UI 2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky
184873213c Add button to copy chat messages on Web client 2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky
f56522cb8e Improve copy button's icon, hover color & click animation in Web UI 2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky
8ff3890ba8 Dynamically generate navigation menu based on user info from server 2024-04-10 10:34:36 +05:30
Debanjum Singh Solanky
94c69eb8e3 Create API endpoint to get authenticated user information
This help clients render UI with user information
2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky
377e979800 Make current chat expand to full width when session panel collapsed
This behavior also matches web client behavior on chat session panel
collapse
2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky
913dcdfbcd Only render first run setup message once if error or server not running 2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky
3b630841bd s/aget_all_filenames_by_source/get_all_filenames_by_source as sync func 2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky
e45edbb992 Collapse navigation tabs into icons on mobile. Add spacing to them 2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky
93edd5427f Add Chat navigation tab back to top pane on web client
Reduces user confusion on how to go to chat pane
Add emoji's for each tab to provide cleaner, iconified division between
the nav options
2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky
8159d1ab25 Fix showing Search navigation tab from Agent pages on web client
The `has_documents' flag wasn't being passed. So the search tab
always showing up as empty instead of being dynamically enabled if
documents had been indexed.
2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky
76cb543347 Show title bar in Khoj desktop app on Windows 2024-04-09 21:04:44 +05:30
Debanjum Singh Solanky
f040418cf1 Fix indexing files in sub-folders on the Desktop app
- `fs.readdir' func in node version 18.18.2 has buggy `recursive' option
  See nodejs/node#48640, effect-ts/effect#1801 for details

- We were recursing down a folder in two ways on the Desktop app.
  Remove `recursive: True' option to the `fs.readdirSync' method call
  to recurse down via app code only
2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky
a8dec1c9d5 Index all text, code files in Github repos. Not just md, org files 2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky
8291b898ca Standardize structure of text to entries to match other entry processors
Add process_single_plaintext_file func etc with similar signatures as
org_to_entries and markdown_to_entries processors

The standardization makes modifications, abstractions easier to create
2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky
079f409238 Skip indexing Github repo on hitting Github API rate limit
Sleep until rate limit passed is too expensive, as it keeps a
app worker occupied.

Ideally we should schedule job to contine after rate limit wait time
has passed. But this can only be added once we support jobs scheduling.
2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky
d5c9b5cb32 Stop indexing commits, issues and issue comments in Github indexer
Normal indexing quickly Github hits rate limits. Purpose of exposing
Github indexer is for indexing content like notes, code and other
knowledge base in a repo.

The current indexer doesn't scale to index metadata given Github's
rate limits, so remove it instead of giving a degraded experience of
partially indexed repos
2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky
7ff1bd9f8b Send more text file types from Desktop app and improve indexing them
- Allow syncing more file types from desktop app to index on server
  - Use `file-type' package to identify valid text file types on Desktop app

- Split plaintext entries into smaller logical units than a whole file
  Since the text splitting upgrades in #645, compiled chunks have more
  logical splits like paragraph, sentence.
  Show those (potentially) smaller snippets to the user as references

- Tangential Fix:
  Initialize unbound currentTime variable for error log timestamp
2024-04-09 20:19:40 +05:30
Debanjum Singh Solanky
89915dcb4c Identify file type by content & allow server to index all text files
- Use Magika's AI for a tiny, portable and better file type
  identification system
- Existing file type identification tools like `file' and `magic'
  require system level packages, that may not be installed by default
  on all operating systems (e.g `file' command on Windows)
2024-04-09 20:19:39 +05:30
sabaimran
312528d471 Fix typo in SECURE_PROXY_SSL_HEADER settings 2024-04-09 12:33:21 +05:30
sabaimran
e56c5e67dd Revert SSL Redirect setting as it prevents the admin page from loading 2024-04-09 12:24:48 +05:30
sabaimran
1770bb174b Add UUID to the KhojUser search fields and inc frequency of telemetry job to 2 mins 2024-04-09 11:51:51 +05:30
sabaimran
ab51ae9091 Use SECURE_SSL_REDIRECT to ensure requests are routed to https always 2024-04-09 10:18:12 +05:30
sabaimran
1c229dad91 Set daily limit for unsubsribed users to 5 in websocket API 2024-04-08 21:16:48 +05:30
sabaimran
27815d982c Redirect user to the login page when either of the csrf token inputs is missing 2024-04-08 20:22:17 +05:30
sabaimran
d257629f81 Handle case when properties field isn't present in the page 2024-04-08 16:15:47 +05:30
Debanjum
9b68062fa9 Add Sponsors Section to Readme 2024-04-08 03:09:24 -07:00
sabaimran
089e0d028b Add a more gracefull error message when the rate limit is exceeded 2024-04-08 15:20:54 +05:30
Debanjum
11ce3e2268 Update Text Chunking Strategy to Improve Search Context (#645)
## Major
- Parse markdown, org parent entries as single entry if fit within max tokens
- Parse a file as single entry if it fits with max token limits
- Add parent heading ancestry to extracted markdown entries for context
- Chunk text in preference order of para, sentence, word, character

## Minor
- Create wrapper function to get entries from org, md, pdf & text files
- Remove unused Entry to Jsonl converter from text to entry class, tests
- Dedupe code by using single func to process an org file into entries

Resolves #620
2024-04-08 13:56:38 +05:30
Debanjum Singh Solanky
9239c2c2ed Update drop large words test to ensure newlines considerd word boundary
Prevent regression to #620
2024-04-08 13:38:08 +05:30
Debanjum Singh Solanky
67b1178aec Remove debug logs generated while compiling org-mode entries 2024-04-08 13:01:24 +05:30
Debanjum
4eda79cc3a Support using Python 3.12 with Khoj (#690)
### Why
- Python 3.12 is the default Python on Ubuntu 24.04 LTS, Windows and Mac via Homebrew
- Python 3.12 has a bunch of improvements that can be explored with Khoj (e.g per core GIL for performance)

## Changes
- The latest PyTorch now supports Python 3.12
- RapidOCR for indexing image PDFs doesn't currently support python 3.12.
  But it's an optional dependency, so only install it if python < 3.12

### Testing
- Verified Khoj installs fine on Windows and Mac with Python 3.12
- Verified Khoj chat works fine on Mac, Windows with Python 3.12

Resolves #522
2024-04-08 11:43:34 +05:30
sabaimran
731ad03348 Skip indexing commits that are missing properties 2024-04-07 15:19:07 +05:30
sabaimran
376eaf64cd Check if results are present in the pages or db response in Notion 2024-04-07 15:19:07 +05:30
Debanjum Singh Solanky
8222615280 Do not add original user message to knowledge search queries for offline chat
It's not required anymore. The extracted questions by the offline chat
model being used should be good enough.
2024-04-07 11:29:35 +05:30
Debanjum Singh Solanky
e3deb29f8e Upgrade khoj.el workflow to use Python 3.11 2024-04-07 11:24:07 +05:30
Debanjum Singh Solanky
14fbf594b2 Support using Python 3.12 with Khoj
- RapidOCR for indexing image PDFs doesn't currently support python 3.12.
  It's an optional dependency anyway, so only install it if python < 3.12
- Run unit tests with python version 3.12 as well

Resolves #522
2024-04-07 11:23:44 +05:30
sabaimran
86c831f7e2 Add a link to the data sources portion in the clients documentation 2024-04-07 09:32:58 +05:30
sabaimran
351fb31a34 Add webpage search to socket codepath, add a feature page for online search 2024-04-07 09:23:29 +05:30
Debanjum Singh Solanky
4be4c53222 Release Khoj version 1.9.0 2024-04-05 17:13:58 +05:30
sabaimran
54db0152b9 Add link to the khoj cloud service for connection to Notion 2024-04-05 15:41:43 +05:30
sabaimran
81f1450c1c Update yarn.lock to sync with package.json for documentation 2024-04-05 15:36:23 +05:30
sabaimran
d22fd6dfe3 Get rid of unnecessary package-lock.json file 2024-04-05 15:34:02 +05:30
sabaimran
7d7ce92e46 Add updated information in docs about the Notion integration 2024-04-05 15:31:43 +05:30
sabaimran
2aedd3c819 Increase freq. of telemetry upload to every 5 minutes 2024-04-05 14:13:47 +05:30
sabaimran
3b1234d084 Await the calls to the db in the notion.py file 2024-04-05 13:58:14 +05:30
sabaimran
19c10b1418 Upgrade the package versions used in yarn.lock for the documentation project 2024-04-05 13:25:41 +05:30
sabaimran
00a67e9524 Add additional log lines when configuring the Notion settings for a user in the callback 2024-04-05 13:19:24 +05:30
sabaimran
d23f7da8e3 Handle the case where a previous serach model isn't set when updating the model 2024-04-05 13:18:51 +05:30
sabaimran
f57f9f672d Address Notion, Image tech debt in indexing code path (#687)
* Add support for using OAuth2.0 in the Notion integration
* Add notion to the admin page
* Remove unnecessary content_index and image search/setup references
* Trigger background job to start indexing Notion after user configures it
* Add a log line when a new Notion integration is setup
* Fix references to the configure_content methods
2024-04-05 12:10:03 +05:30
sabaimran
69dee75c34 Update the readme for accuracy, updated demos 2024-04-04 10:57:24 +05:30
sabaimran
a60321b68e Push khoj to include inline references when possible 2024-04-04 10:31:13 +05:30
sabaimran
5bdcb4e69c Wait for location data to be returned before setting up the socket connection 2024-04-04 10:31:13 +05:30
Debanjum Singh Solanky
00f599ea78 Fix passing flags to re.split to break org, md content by heading level
`re.MULTILINE' should be passed to the `flags' argument, not the
`max_splits' argument of the `re.split' func

This was messing up the indexing by only allowing a maximum of
re.MULTILINE splits. Fixing this improves the search quality to
previous state
2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky
32ac0622ff Extract dates from compiled text entries 2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky
29c1c18042 Increase search distance to get relevant content for chat post indexer update
More content indexed per entry would result in an overall scores
lowering effect. Increase default search distance threshold to counter that

- Details
  - Fix expected results post indexing updates
  - Fix search with max distance post indexing updates

- Minor
  - Remove openai chat actor test for after: operator as it's not expected anymore
2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky
ad4fa4b2f4 Fix adding file path instead of stem to markdown entries 2024-04-04 02:41:55 +05:30
sabaimran
720139c3c1 Fix all unit tests for test_text_search 2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky
44b3247869 Update logical splitting of org-mode text into entries
- Major
  - Do not split org file, entry if it fits within the max token limits
    - Recurse down org file entries, one heading level at a time until
      reach leaf node or the current parent tree fits context window
    - Update `process_single_org_file' func logic to do this recursion

  - Convert extracted org nodes with children into entries
    - Previously org node to entry code just had to handle leaf entries
    - Now it recieve list of org node trees
    - Only add ancestor path to root org-node of each tree
    - Indent each entry trees headings by +1 level from base level (=2)

- Minor
  - Stop timing org-node parsing vs org-node to entry conversion
    Just time the wrapping function for org-mode entry extraction
    This standardizes what is being timed across at md, org etc.
  - Move try/catch to `extract_org_nodes' from `parse_single_org_file'
    func to standardize this also across md, org
2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky
eaa27ca841 Only add spaces after heading if any tags in orgnode raw entry repr 2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky
2ea8a832a0 Log error when fail to index md file. Fix, improve typing in md_to_entries 2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky
44eab74888 Dedupe code by using single func to process an org file into entries
Add type hints to orgnode and org-to-entries packages
2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky
db2581459f Parse markdown parent entries as single entry if fit within max tokens
These changes improve context available to the search model.
Specifically this should improve entry context from short knowledge trees,
that is knowledge bases with sparse, short heading/entry trees

Previously we'd always split markdown files by headings, even if a
parent entry was small enough to fit entirely within the max token
limits of the search model. This used to reduce the context available
to the search model to select appropriate entries for a query,
especially from short entry trees

Revert back to using regex to parse through markdown file instead of
using MarkdownHeaderTextSplitter. It was easier to implement the
logical split using regexes rather than bend MarkdowHeaderTextSplitter
to implement it.
- DFS traverse the markdown knowledge tree, prefix ancestry to each entry
2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky
982ac1859c Parse markdown file as single entry if it fits with max token limits
These changes improve entry context available to the search model
Specifically this should improve entry context from short knowledge trees,
that is knowledge bases with small files

Previously we split all markdown files by their headings,
even if the file was small enough to fit entirely within the max token
limits of the search model. This used to reduce the context available
to select the appropriate entries for a given query for the search model,
especially from short knowledge trees
2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky
d8f01876e5 Add parent heading ancestory to extracted markdown entries for context
Improve, update the markdown to entries extractor tests
2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky
86575b2946 Chunk text in preference order of para, sentence, word, character
- Previous simplistic chunking strategy of splitting text by space
  didn't capture notes with newlines, no spaces. For e.g in #620

- New strategy will try chunk the text at more natural points like
  paragraph, sentence, word first. If none of those work it'll split
  at character to fit within max token limit

- Drop long words while preserving original delimiters

Resolves #620
2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky
a627f56a64 Remove unused Entry to Jsonl converter from text to entry class, tests
This was earlier used when the index was plaintext jsonl file. Now
that documents are indexed in a DB this func is not required.

Simplify org,md,pdf,plaintext to entries tests by removing the entry
to jsonl conversion step
2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky
28105ee027 Create wrapper function to get entries from org, md, pdf & text files
- Convert extract_org_entries function to actually extract org entries
  Previously it was extracting intermediary org-node objects instead
  Now it extracts the org-node objects from files and converts them
  into entries
- Create separate, new function to extract_org_nodes from files
- Similarly create wrapper funcs for md, pdf, plaintext to entries

- Update org, md, pdf, plaintext to entries tests to use the new
  simplified wrapper function to extract org entries
2024-04-04 02:41:55 +05:30
Debanjum Singh Solanky
f01a12b1d2 Improve styling of chat sessions side panel
- Move green server connected dot to the bottom. Show status when
  disconnected from server
- Move "New conversation" button to right of the "Conversation" title
- Center alignment of the new conversation and connection status buttons
2024-04-04 01:43:26 +05:30
sabaimran
dd1e5e145a Use List[Any] for typing 2024-04-03 21:46:41 +05:30
sabaimran
b8087c4c8e Add typing to empty list variables in github_to_entries 2024-04-03 21:41:36 +05:30
sabaimran
d036fdfc26 If tree is not in the contents, then just return empty files list 2024-04-03 17:55:25 +05:30
Debanjum Singh Solanky
f915b2bd14 Fix passing model_name param to chatml formatter for online chat 2024-04-03 17:21:43 +05:30
sabaimran
6aa88761b8 Skip creating the default agent if there's no default conversation config 2024-04-03 17:21:01 +05:30
sabaimran
9c42c8be6b Merge pull request #679 from khoj-ai/features/chat-socket-streaming
Add a websocket for streaming from the chat UI
2024-04-03 04:43:31 -07:00
sabaimran
b4f71e06b3 Add timeout after 10 minutes of inactivity on socket 2024-04-02 22:12:27 +05:30
sabaimran
f48426623d resolve merge conflict in chat.html 2024-04-02 17:29:48 +05:30
sabaimran
bf1187f465 Use new online/websearch logic and add agent to chat_metadata 2024-04-02 17:20:38 +05:30
sabaimran
867e1007d1 Remove superfluous newline 2024-04-02 17:20:08 +05:30
sabaimran
228ad68042 Merge with origin/master 2024-04-02 17:02:21 +05:30
sabaimran
776550d5ce Add a migration for updating the default chat model, update for existing users 2024-04-02 17:01:31 +05:30
sabaimran
47fc7e1ce6 Rebase with matser 2024-04-02 16:16:06 +05:30
Debanjum
215ab6e66a Extract More Dates from entries to improve Date Filter (#683)
- Overview
  - Extract more structured date variants (e.g with dot(.) & slash(/) separators, 2-digit year)
  - Extract some natural, partial dates as well from entries
- Capability
  Add ability to extract the following additional date forms:
  - Natural Dates: 21st April 2000, February 29 2024
  - Partial Natural Dates: March 24, Mar 2024
  - Structured Dates: 20/12/24, 20.12.2024, 2024/12/20
  Note: Previously only YYYY-MM-DD ISO-8601 structured date form was extracted for date filters
- Performance
  Using regexes is MUCH faster than using the `dateparser' python library
  It's a little crude but gives acceptable performance for large datasets
2024-04-02 16:14:53 +05:30
Debanjum
3c3e48b18c Migrate to Llama.cpp for Offline Chat (#680)
## Benefits
- Support all GGUF format chat models
- Support more GPUs like AMD, Nvidia, Mac, Vulcan (previously just Vulcan, Mac)
- Support more capabilities like larger context window, schema enforcement, speculative decoding etc.

## Changes
### Major
- Use llama.cpp for offline chat models
  - Support larger context window
  - Automatically apply appropriate chat template. So offline chat models not using llama2 format are now supported
  - Use better default offline chat model, NousResearch/Hermes-2-Pro-Mistral-7B
- Enable extract queries actor to improve notes search with offline chat
- Update documentation to use llama.cpp for offline chat in Khoj

### Minor
- Migrate to use NouseResearch's Hermes-2-Pro 7B as default offline chat model in khoj.yml
- Rename GPT4AllChatProcessor to OfflineChatProcessor Config, Model
- Only add location to image prompt generator when location known
2024-04-02 15:49:42 +05:30
Debanjum Singh Solanky
7afee2d55c Let offline chat model set context window. Improve, fix prompts 2024-03-31 16:19:35 +05:30
Debanjum Singh Solanky
4228965c9b Handle msg truncation when question is larger than max prompt size
Notice and truncate the question it self at this point
2024-03-31 15:50:06 +05:30
Debanjum Singh Solanky
c6487f2e48 Fix docs showing how to setup llama-cpp with Khoj 2024-03-31 15:36:40 +05:30
Debanjum Singh Solanky
886d49e3a4 Merge branch 'master' into migrate-to-llama-cpp-for-offline-chat 2024-03-31 00:59:20 +05:30
Debanjum Singh Solanky
4f65dde201 Release Khoj version 1.8.0 2024-03-31 00:06:15 +05:30
sabaimran
c0e78fd56d Fix broken get-started documentation links 2024-03-30 15:05:12 +05:30
sabaimran
dd2a3f712b Add more demo videos, images, add feature sections 2024-03-30 14:48:46 +05:30
sabaimran
4cb91a042e Add an agents feature page, and clarification around custom domains 2024-03-30 14:20:46 +05:30
sabaimran
928f273bbe Configure production setup for moving to single worker model 2024-03-30 10:35:55 +05:30
Debanjum Singh Solanky
7923903d21 Improve date filter regexes to extract structured, natural, partial dates
- Much faster than using dateparser
  - It took 2x-4x for improved regex to extracts 1-15% more dates
  - Whereas It took 33x to 100x for dateparser to extract 65% - 400% more dates
  - Improve date extractor tests to test deduping dates, natural,
    structured date extraction from content

- Extract some natural, partial dates and more structured dates
  Using regex is much faster than using dateparser. It's a little
  crude but should pay off in performance.

  Supports dates of form:
  - (Day-of-Month) Month|AbbreviatedMonth Year|2DigitYear
  - Month|AbbreviatedMonth (Day-of-Month) Year|2DigitYear
2024-03-30 00:07:19 +05:30
Debanjum Singh Solanky
104eeea274 Extract natural language and locale specific dates in content
Previously we just extracted dates in YYYY-MM-DD format from content
for date filterings during search.

Use dateparser to extract dates across locales and natural language

This should improve notes returned as context when chat searches
knowledge base with date filters

Fallback to regex for date parsing from content if dateparser fails

- Limit natural date extractor capabilities to improve performance
  - Assume language is english
    Language detection otherwise takes a REALLY long time
  - Do not extract unix timestamps, timezone
    - This isn't required, as just using date and approximating dates as UTC
2024-03-30 00:06:56 +05:30
Debanjum Singh Solanky
90c5b3c410 Update stale Khoj pypi package metadata
Use latest License, Intended Audience and Dev Status
2024-03-29 00:06:55 +05:30
sabaimran
1195f843a3 Remove forward slash from the root agents endpoint 2024-03-28 23:06:55 +05:30
Debanjum Singh Solanky
a374288cea Use OIDC TrustedPublisher to publish khoj python package to PyPi 2024-03-28 22:58:36 +05:30
sabaimran
3417164ec2 Bump gunicorn workers up to 8 2024-03-28 22:34:13 +05:30
sabaimran
a1729b9b9e Add telemetry for agents used in conversation, increase image width in agents page 2024-03-28 22:18:11 +05:30
sabaimran
d503b3e867 Use Personality vernacular in agent page
- When setting up the default agent, configure every conversation that doesn't have an agent to use the Khoj agent
- Fix reverse migration for the locale removal migration
2024-03-28 15:07:02 +05:30
sabaimran
e59de8c9b1 Constrain width/size of agent image in agents view 2024-03-28 13:32:11 +05:30
sabaimran
6cb38d92c0 Specify version of pypi gh publish action 2024-03-28 12:47:31 +05:30
sabaimran
56da96b2e9 Increase minimum python required in the pyproject, use python 3.11 for building the wheel in the workflow 2024-03-28 12:19:07 +05:30
sabaimran
22014cfcbc Merge pull request #682 from khoj-ai/features/full-integration-agents
Add support for custom agents configured by the server admin
2024-03-27 23:27:15 -07:00
sabaimran
17776daed8 Merge from master 2024-03-28 11:38:29 +05:30
sabaimran
32a505d841 Revert to using the nvidia base image for the next release 2024-03-28 11:37:37 +05:30
sabaimran
51d0c9b8b0 Add telemetry to keep state of new agents being used 2024-03-28 11:37:24 +05:30
sabaimran
46ebc55e2b Add a top tab for agents 2024-03-28 11:37:01 +05:30
sabaimran
8397187231 Use default agent when creating a new conversation without agent specified 2024-03-28 11:36:27 +05:30
Debanjum Singh Solanky
8c4ef9270d Fix using format string for logger in chat API endpoint 2024-03-27 16:31:22 +05:30
Debanjum Singh Solanky
4912c0ee30 Use extract queries actor to improve notes search with offline chat
Previously we were skipping the extract questions step for offline
chat as default offline chat model wasn't good enough to output proper
json given the time it took to extract questions.

The new default offline chat models gives json much more regularly and
with date filters, so the extract questions step becomes useful given
the impact on latency
2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky
1ebd5c3648 Rename GPT4AllChatProcessor* to OfflineChatProcessor Config, Model 2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky
2a0b943bb4 Use Hermes-2-Pro as default offline chat model in khoj.yml 2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky
dcdd1edde2 Update docs to show how to setup llama-cpp with Khoj
- How to pip install khoj to run offline chat on GPU
  After migration to llama-cpp-python more GPU types are supported but
  require build step so mention how
- New default offline chat model
- Where to get supported chat models from on HuggingFace
2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky
8ca39a436c Use llama.cpp for offline chat models
- Benefits of moving to llama-cpp-python from gpt4all:
  - Support for all GGUF format chat models
  - Support for AMD, Nvidia, Mac, Vulcan GPU machines (instead of just Vulcan, Mac)
  - Supports models with more capabilities like tools, schema
    enforcement, speculative ddecoding, image gen etc.
- Upgrade default chat model, prompt size, tokenizer for new supported
  chat models

- Load offline chat model when present on disk without requiring internet
  - Load model onto GPU if not disabled and device has GPU
  - Load model onto CPU if loading model onto GPU fails
  - Create helper function to check and load model from disk, when model
    glob is present on disk.

    `Llama.from_pretrained' needs internet to get repo info from
    HuggingFace. This isn't required, if the model is already downloaded

    Didn't find any existing HF or llama.cpp method that looked for model
    glob on disk without internet
2024-03-26 22:33:01 +05:30
Debanjum Singh Solanky
0a7392f6ec Only add location to image prompt generator when location known 2024-03-26 22:33:01 +05:30
sabaimran
fdf78525b4 Part 2: Add web UI updates for basic agent interactions (#675)
* Initial pass at backend changes to support agents
- Add a db model for Agents, attaching them to conversations
- When an agent is added to a conversation, override the system prompt to tweak the instructions
- Agents can be configured with prompt modification, model specification, a profile picture, and other things
- Admin-configured models will not be editable by individual users
- Add unit tests to verify agent behavior. Unit tests demonstrate imperfect adherence to prompt specifications

* Customize default behaviors for conversations without agents or with default agents

* Add a new web client route for viewing all agents

* Use agent_id for getting correct agent

* Add web UI views for agents
- Add a page to view all agents
- Add slugs to manage agents
- Add a view to view single agent
- Display active agent when in chat window
- Fix post-login redirect issue

* Fix agent view

* Spruce up the 404 page and improve the overall layout for agents pages

* Create chat actor for directly reading webpages based on user message

- Add prompt for the read webpages chat actor to extract, infer
  webpage links
- Make chat actor infer or extract webpage to read directly from user
  message
- Rename previous read_webpage function to more narrow
  read_webpage_at_url function

* Rename agents_page -> agent_page

* Fix unit test for adding the filename to the compiled markdown entry

* Fix layout of agent, agents pages

* Merge migrations

* Let the name, slug of the default agent be Khoj, khoj

* Fix chat-related unit tests

* Add webpage chat command for read web pages requested by user

Update auto chat command inference prompt to show example of when to
use webpage chat command (i.e when url is directly provided in link)

* Support webpage command in chat API

- Fallback to use webpage when SERPER not setup and online command was
  attempted
- Do not stop responding if can't retrieve online results. Try to
  respond without the online context

* Test select webpage as data source and extract web urls chat actors

* Tweak prompts to extract information from webpages, online results

- Show more of the truncated messages for debugging context
- Update Khoj personality prompt to encourage it to remember it's capabilities

* Rename extract_content online results field to webpages

* Parallelize simple webpage read and extractor

Similar to what is being done with search_online with olostep

* Pass multiple webpages with their urls in online results context

Previously even if MAX_WEBPAGES_TO_READ was > 1, only 1 extracted
content would ever be passed.

URL of the extracted webpage content wasn't passed to clients in
online results context. This limited them from being rendered

* Render webpage read in chat response references on Web, Desktop apps

* Time chat actor responses & chat api request start for perf analysis

* Increase the keep alive timeout in the main application for testing

* Do not pipe access/error logs to separate files. Flow to stdout/stderr

* [Temp] Reduce to 1 gunicorn worker

* Change prod docker image to use jammy, rather than nvidia base image

* Use Khoj icon when Khoj web is installed on iOS as a PWA

* Make slug required for agents

* Simplify calling logic and prevent agent access for unauthenticated users

* Standardize to use personality over tuning in agent nomenclature

* Make filtering logic more stringent for accessible agents and remove unused method:

* Format chat message query

---------

Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>
2024-03-26 18:13:24 +05:30
Debanjum Singh Solanky
15ed208996 Use Khoj icon when Khoj web is installed on iOS as a PWA 2024-03-26 00:13:12 +05:30
sabaimran
f8eaff574f Change prod docker image to use jammy, rather than nvidia base image 2024-03-25 23:09:58 +05:30
sabaimran
2b5341f53a [Temp] Reduce to 1 gunicorn worker 2024-03-25 16:13:04 +05:30
sabaimran
991f500775 Do not pipe access/error logs to separate files. Flow to stdout/stderr 2024-03-25 16:12:39 +05:30
Debanjum
586654e2af Allow directly reading web pages, even when SERP not enabled (#676)
### Overview
Khoj can now read website directly without needing to go through the search step first

### Details
- Parallelize simple webpage read and extractor
- Rename extract_content online results field to web pages
- Tweak prompts to extract information from webpages, online results
- Test select webpage as data source and extract web urls chat actors

- Render webpage read in chat response references on Web, Desktop apps
- Pass multiple webpages with their urls in online results context

- Support webpage command in chat API
- Add webpage chat command for read web pages requested by user
- Create chat actor for directly reading webpages based on user message
2024-03-24 16:25:25 +05:30
Debanjum Singh Solanky
9e52ae9e98 Time chat actor responses & chat api request start for perf analysis 2024-03-24 15:47:38 +05:30
Debanjum Singh Solanky
dabf71bc3c Render webpage read in chat response references on Web, Desktop apps 2024-03-24 15:47:38 +05:30
Debanjum Singh Solanky
a2e79c94be Pass multiple webpages with their urls in online results context
Previously even if MAX_WEBPAGES_TO_READ was > 1, only 1 extracted
content would ever be passed.

URL of the extracted webpage content wasn't passed to clients in
online results context. This limited them from being rendered
2024-03-24 15:47:38 +05:30
Debanjum Singh Solanky
71b6905008 Parallelize simple webpage read and extractor
Similar to what is being done with search_online with olostep
2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky
1167f6ddf9 Rename extract_content online results field to webpages 2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky
b22a7dae5d Tweak prompts to extract information from webpages, online results
- Show more of the truncated messages for debugging context
- Update Khoj personality prompt to encourage it to remember it's capabilities
2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky
85c62efca1 Test select webpage as data source and extract web urls chat actors 2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky
ad6f6bb0ed Support webpage command in chat API
- Fallback to use webpage when SERPER not setup and online command was
  attempted
- Do not stop responding if can't retrieve online results. Try to
  respond without the online context
2024-03-24 15:46:29 +05:30
Debanjum Singh Solanky
a6b7432837 Add webpage chat command for read web pages requested by user
Update auto chat command inference prompt to show example of when to
use webpage chat command (i.e when url is directly provided in link)
2024-03-24 15:46:29 +05:30
sabaimran
8abc8ded82 Part 1: Server-side changes to support agents integrated with Conversations (#671)
* Initial pass at backend changes to support agents
- Add a db model for Agents, attaching them to conversations
- When an agent is added to a conversation, override the system prompt to tweak the instructions
- Agents can be configured with prompt modification, model specification, a profile picture, and other things
- Admin-configured models will not be editable by individual users
- Add unit tests to verify agent behavior. Unit tests demonstrate imperfect adherence to prompt specifications

* Customize default behaviors for conversations without agents or with default agents

* Use agent_id for getting correct agent

* Merge migrations

* Simplify some variable definitions, add additional security checks for agents

* Rename agent.tuning -> agent.personality
2024-03-23 22:09:38 +05:30
sabaimran
4deb849fb1 Merge branch 'features/add-agents-ui' of github.com:khoj-ai/khoj into features/chat-socket-streaming 2024-03-23 14:04:25 +05:30
sabaimran
8edbd7094f Let the name, slug of the default agent be Khoj, khoj 2024-03-23 14:03:58 +05:30
sabaimran
6b4c4f10b5 Merge branch 'features/add-agents-ui' of github.com:khoj-ai/khoj into features/chat-socket-streaming 2024-03-23 11:22:00 +05:30
sabaimran
20617614ae Merge branch 'features/customize-chat-with-agents' of github.com:khoj-ai/khoj into features/add-agents-ui 2024-03-23 11:20:57 +05:30
sabaimran
2399d91f61 Merge migrations 2024-03-22 10:05:33 +05:30
sabaimran
d38089ab57 Merge with origin 2024-03-22 09:55:33 +05:30
Debanjum Singh Solanky
7416ca9ae1 Lower the default gunicorn workers running on prod 2024-03-21 04:35:52 +05:30
Debanjum Singh Solanky
aed4313cfc Fix updating specific conversation by id from the chat API endpoint
- Use the conversation id of the retrieved conversation rather than the
  potentially unset conversation id passed via API
- await creating new chat when no chat id provided and no existing
  conversations exist
2024-03-21 02:46:52 +05:30
Debanjum Singh Solanky
ec6dc0daaf Bump up the default gunicorn workers running on prod 2024-03-20 22:56:09 +05:30
sabaimran
6ba0d8e379 Add a connected notification if the websocket is connected 2024-03-20 20:53:28 +05:30
sabaimran
255b69dc58 Add a comma delimeter between outputted search queries 2024-03-20 19:43:35 +05:30
sabaimran
d84188b221 Scroll down when a message is added in the chat interface's handle stream response method 2024-03-20 15:04:41 +05:30
sabaimran
70ad78990a Use a common method for sending a generic message to the client from the server in the ws connection 2024-03-20 15:04:14 +05:30
sabaimran
d4e83b060a Update the web UI for the chat interface to establish a connection via a socket to the server
- Move some common methods into separate functions to make the UI components more efficient
- The normal HTTP-based chat connection will still work and serves as a fallback if the websocket is unavailable
2024-03-20 14:34:47 +05:30
sabaimran
a346f79b39 Add support for chatting via the web socket connection
- Convert to a model of calling the search API directly with a function call (rather than using the API method)
- Gracefully handle websocket connection disconnects
- Ensure that the rest of the response is still saved, as it is currently, if the user disconects from the client
- Setup unchangeable context at the beginning of the session when the connection is established (like location, username, etc)
2024-03-20 14:33:33 +05:30
sabaimran
36af9776e6 Add the websockets dependency to pyproject.toml 2024-03-20 14:11:18 +05:30
Debanjum Singh Solanky
62a83dc9bb Fix online search actor to use natural dates not after: operator
The recently added after: operator to online search actor was too
restrictive, gave worse results than when just use natural language
dates in search query
2024-03-15 21:50:14 +05:30
Debanjum Singh Solanky
4a1e6a2275 Convert deleted old user requests log line to debug from info 2024-03-15 20:50:10 +05:30
Debanjum Singh Solanky
9a068dadbf Fix extract questions prompt to use YYYY-MM-DD date filter format 2024-03-15 18:43:18 +05:30
Debanjum
bb2693c792 Improve Chat Session UX, Fix Login, Chat Message Truncation (#677)
### Improve
- Improve delete, rename chat session UX in Desktop, Web app
- Get conversation by title when requested via chat API

### Fix
- Allow unset locale for Google authenticating user
- Handle truncation when single long non-system chat message
- Fix setting chat session title from Desktop app
- Only create new chat on get if a specific chat id, slug isn't requested
2024-03-15 18:19:36 +05:30
Debanjum Singh Solanky
ecddf98430 Handle truncation when single long non-system chat message
Previously was assuming the system prompt is being always passed as
the first message. So expected there to be at least 2 messages in logs.

This broke chat actors querying with single long non system message.

A more robust way to extract system prompt is via the message role
instead
2024-03-15 15:58:39 +05:30
Debanjum Singh Solanky
ec0c35b7ed Improve delete, rename chat session UX in Desktop, Web app
- Ask for Confirmation before deleting chat session in Desktop, Web app
- Save chat session rename on hitting enter in title edit input box
- No need to flash previous conversation cleared status message
- Move chat session delete button after rename button in Desktop app
2024-03-15 15:58:19 +05:30
Debanjum Singh Solanky
924b1215ce Allow unset locale for Google authenticated user 2024-03-15 15:35:20 +05:30
Debanjum Singh Solanky
c792fa819f Fix setting chat session title from Desktop app
Pass auth headers to not have the chat session title update request fail
2024-03-15 15:19:20 +05:30
Debanjum Singh Solanky
c9e05dc184 Get conversation by title when requested via chat API 2024-03-15 12:31:50 +05:30
sabaimran
724557fc7b Merge branch 'master' of github.com:khoj-ai/khoj into features/add-agents-ui 2024-03-15 12:14:34 +05:30
sabaimran
7fc484ba7a Merge branch 'master' of github.com:khoj-ai/khoj into features/customize-chat-with-agents 2024-03-15 12:13:28 +05:30
Debanjum Singh Solanky
cac26dafe3 Only create new chat on get if a specific chat id, slug isn't requested 2024-03-15 11:58:39 +05:30
sabaimran
416feb13ef Fix layout of agent, agents pages 2024-03-15 11:17:40 +05:30
sabaimran
1b3fc68a87 Fix unit test for adding the filename to the compiled markdown entry 2024-03-15 11:01:48 +05:30
sabaimran
d734be61cf Rename agents_page -> agent_page 2024-03-15 10:17:51 +05:30
Debanjum Singh Solanky
8cdfaf41ec Update project URLs to show on pypi project page 2024-03-15 04:03:39 +05:30
Debanjum Singh Solanky
08993ff109 Add new, remove old known chat models from model to prompt size map 2024-03-15 04:02:25 +05:30
Debanjum Singh Solanky
fba0338787 Release Khoj version 1.7.0 2024-03-15 00:08:32 +05:30
sabaimran
345afec47e Resolve merge conflicts/ use agent_slug instead of agent_id for lookup 2024-03-14 16:16:07 +05:30
Debanjum Singh Solanky
6118d1ff57 Create chat actor for directly reading webpages based on user message
- Add prompt for the read webpages chat actor to extract, infer
  webpage links
- Make chat actor infer or extract webpage to read directly from user
  message
- Rename previous read_webpage function to more narrow
  read_webpage_at_url function
2024-03-14 14:58:37 +05:30
Debanjum
e549824fe2 Improve OpenAI Chat Actors and their prompts (#673)
### Major
- Enforce json mode response from OpenAI chat actors prev using string lists
- Use `gpt-4-turbo-preview' as default chat model, extract questions actor
- Make Khoj read khoj website to respond with accurate, up-to-date information about itself
- Dedupe query in notes prompt. Improve OAI chat actor, director tests

### Minor
- Test data source, output mode selector, web search query chat actors
- Improve notes search actor to always create a non-empty list of queries
- Construct available data sources, output modes as a bullet list in prompts
- Use consistent agent name across static and dynamic examples in prompts
- Add actor's name to extract questions prompt to improve context for guidance
2024-03-14 12:44:40 +05:30
sabaimran
3caf0a79d8 Spruce up the 404 page and improve the overall layout for agents pages 2024-03-14 11:26:49 +05:30
sabaimran
c45030af44 Fix agent view 2024-03-14 11:13:19 +05:30
Debanjum Singh Solanky
a1ce12296f Fix rendering online with note references post streaming chat response
Previously only the notes references would get rendered post response
streaming when when both online and notes references were used to
respond to the user's message
2024-03-14 03:40:40 +05:30
Debanjum Singh Solanky
1aeea3d854 Fix opening external links from confirmation dialog box on desktop app 2024-03-14 02:29:22 +05:30
Debanjum Singh Solanky
2e5cc49cb3 Enforce json response from OpenAI chat actors prev using string lists
- Allow passing response format type to OpenAI API via chat actors
- Convert in-context examples to use json objects instead of str lists
- Update actors outputting str list to request output to be json_object
  - OpenAI's json mode enforces the model to output valid json object
2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky
7211eb9cf5 Default to gpt-4-turbo-preview for chat model, extract questions actor
GPT-4 is more expensive and generally less capable than gpt-4-turbo-preview
2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky
dd883dc53a Dedupe query in notes prompt. Improve OAI chat actor, director tests
- Remove stale tests
- Improve tests to pass across gpt-3.5 and gpt-4-turbo
- The haiku creation director was failing because of duplicate query in
  instantiated prompt
2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky
70b04d16c0 Test data source, output mode selector, web search query chat actors 2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky
14682d5354 Improve notes search actor to always create a non-empty list of queries
- Remove the option for Notes search query generation actor to return
  no queries. Whether search should be performed is decided before,
  this step doesn't need to decide that
- But do not throw warning if the response is a list with no elements
2024-03-14 01:22:33 +05:30
Debanjum Singh Solanky
f5734826cb Improve pick data source prompt to look online for info about Khoj
- Add examples where user queries requesting information about Khoj
  results in the "online" data source being selected
- Add an example for "general" to select chat command prompt
2024-03-14 01:21:13 +05:30
Debanjum Singh Solanky
9a516bed47 Construct available data sources, output modes as a bullet list in prompts 2024-03-14 00:34:57 +05:30
Debanjum Singh Solanky
f28fb89af8 Use consistent agent name across static and dynamic examples in prompts
Previously the examples constructed from chat history used "Khoj" as
the agent's name but all 3 prompts using the func used static examples
with "AI:" as the pertinent agent's name
2024-03-14 00:34:57 +05:30
Debanjum Singh Solanky
f5793149a9 Add actor's name to extract questions prompt to improve context for guidance 2024-03-14 00:34:57 +05:30
Debanjum Singh Solanky
73ad444086 Make online search Actor read khoj.dev for docs, info about Khoj
- Add example to read khoj.dev website for up-to-date info to setup,
  use khoj, discover khoj features etc.
- Online search should use site: and after: google search operators
  - Show example of adding the after: date filter to google search
- Give local event lookup example using user's current location in
  query
- Remove unused select search content type prompt
2024-03-14 00:34:57 +05:30
sabaimran
290712c3fe Add web UI views for agents
- Add a page to view all agents
- Add slugs to manage agents
- Add a view to view single agent
- Display active agent when in chat window
- Fix post-login redirect issue
2024-03-14 00:07:36 +05:30
Debanjum
3abe7ccb26 Improve Online Search Speed and Context (#670)
### Major
- Read web pages in parallel to improve chat response time
- Read web pages directly when Olostep proxy not setup
- Include search results & web page content in online context for chat response

### Minor
- Simplify, modularize and add type hints to online search functions
2024-03-11 22:16:30 +05:30
Debanjum Singh Solanky
dc86e44a07 Include search results & webpage content in online context for chat response
Previously if a web page was read for a sub-query, only the extracted
web page content was provided as context for the given sub-query. But
the google results themselves have relevant snippets. So include them
2024-03-11 18:41:02 +05:30
Debanjum Singh Solanky
d136a6be44 Simplify, modularize and add type hints to online search functions
- Simplify content arg to `extract_relevant_info' function. Validate,
  clean the content arg inside the `extract_relevant_info' function

- Extract `search_with_google' function outside the parent function
- Call the parent function a more appropriate `search_online' instead
  of `search_with_google'
- Simplify the `search_with_google' function using list comprehension.
  Drop empty search result fields from chat model context for response
  to reduce cost and response latency

- No need to show stacktrace when unable to read webpage, basic error
  is enough
- Add type hints to online search functions to catch issues with mypy
2024-03-11 18:41:02 +05:30
Debanjum Singh Solanky
88f096977b Read webpages directly when Olostep proxy not setup
This is useful for self-hosted, individual user, low traffic setups
where a proxy service is not required
2024-03-11 18:41:02 +05:30
Debanjum Singh Solanky
ca2f962e95 Read, extract information from web pages in parallel to lower response time
- Time reading webpage, extract info from webpage steps for perf
  analysis
- Deduplicate webpages to read gathered across separate google
  searches
- Use aiohttp to make API requests non-blocking, pair with asyncio to
  parallelize all the online search webpage read and extract calls
2024-03-11 18:41:02 +05:30
sabaimran
8e1445b15b Use agent_id for getting correct agent 2024-03-11 14:44:46 +05:30
sabaimran
6ab649312f Add a new web client route for viewing all agents 2024-03-11 14:40:40 +05:30
sabaimran
352168d6c2 Customize default behaviors for conversations without agents or with default agents 2024-03-11 14:20:28 +05:30
sabaimran
9b88976f36 Initial pass at backend changes to support agents
- Add a db model for Agents, attaching them to conversations
- When an agent is added to a conversation, override the system prompt to tweak the instructions
- Agents can be configured with prompt modification, model specification, a profile picture, and other things
- Admin-configured models will not be editable by individual users
- Add unit tests to verify agent behavior. Unit tests demonstrate imperfect adherence to prompt specifications
2024-03-11 12:45:24 +05:30
sabaimran
1da453306e Add num online for Discord badge 2024-03-10 17:48:30 +05:30
Debanjum
18fa3e2384 Rerank Search Results by Default on GPU machines (#668)
- Trigger
   SentenceTransformer Cross Encoder models now run fast on GPU enabled machines, including Mac ARM devices since UKPLab/sentence-transformers#2463

- Details
  - Use cross-encoder to rerank search results by default on GPU machines and when using an inference server
  - Only call search API when pause in typing search query on web, desktop apps
2024-03-10 15:15:25 +05:30
Debanjum Singh Solanky
53d402480c Rerank search results with cross-encoder when using an inference server
If an inference server is being used, we can expect the cross encoder
to be running fast enough to rerank search results by default
2024-03-10 15:09:46 +05:30
Debanjum Singh Solanky
44c8d09342 Only call search API when pause in typing search query on web, desktop apps
Wait for 300ms since stop typing before calling search API.

This smooths out UI jitter when rendering search results, especially
now that we're reranking for every search query on GPU enabled devices

Emacs already has 300ms debounce time. More convoluted to add
debounce time to Obsidian search modal, so not updating that yet
2024-03-10 14:29:24 +05:30
Debanjum Singh Solanky
1105d8814f Use cross-encoder to rerank search results by default on GPU machines
Latest sentence-transformer package uses GPU for cross-encoder. This
makes it fast enough to enable reranking on machines with GPU.

Enabling search reranking by default allows (at least) users with GPUs
to side-step learning the UI affordance to rerank results
(i.e hitting Cmd/Ctrl-Enter or ENTER).
2024-03-10 14:29:21 +05:30
Debanjum
8eb3c441ec Do not create new chat session when an old chat session is deleted (#669)
### Issue
Previously deleting a chat session from the side panel on desktop, web app would sometimes result in also creating a new chat session

### Fix
  `get_conversation_by_user' shouldn't return new conversation if
  conversation with requested id not found.

  It should only return new conversation if no specific conversation
  is requested and no conversations found for user at all

### Miscellaneous Improvements
  - Chat history load should be logged as call to that chat_history api,
    not the "chat" api
  - Show status updates of clearing conversation history in chat input
  - Simplify web, desktop client code by removing unnecessary new variables

### Repro
  - Delete a new chat, this calls loadChat via window.onload which
    calls server /chat/history API endpoint with conversationId set to
    that of just deleted conversation sporadically

    The call to GET chat/history API with conversationId set occurs
    when window.onload triggers before the conversationId is deleted
    by the delete button after the DELETE /chat/history API call (via race)

  - In such a scenario, get_conversation_by_user called by
    chat/history API with conversationId of deleted conversation
    returns a new conversation
2024-03-10 14:14:43 +05:30
Debanjum Singh Solanky
fd81446ba3 Do not create new chat session when an old chat session is deleted
- Fix
  `get_conversation_by_user' shouldn't return new conversation if
  conversation with requested id not found.

  It should only return new conversation if no specific conversation
  is requested and no conversations found for user at all

- Repro
  - Delete a new chat, this calls loadChat via window.onload which
    calls server /chat/history API endpoint with conversationId set to
    that of just deleted conversation sporadically

    The call to GET chat/history API with conversationId set occurs
    when window.onload triggers before the conversationId is deleted
    by the delete button after the DELETE /chat/history API call (via race)

  - In such a scenario, get_conversation_by_user called by
    chat/history API with conversationId of deleted conversation
    returns a new conversation

- Miscellaneous
  - Chat history load should be logged as call to that chat_history api,
    not the "chat" api
  - Show status updates of clearing conversation history in chat input
  - Simplify web, desktop client code by removing unnecessary new variables
2024-03-10 02:17:23 +05:30
Debanjum Singh Solanky
b7fad04870 Use consistent field name for queries in chat history & better image prompt 2024-03-09 19:11:03 +05:30
sabaimran
086d5f8324 Add link to drag drop pdf demo video 2024-03-09 17:02:23 +05:30
sabaimran
6aae9864d3 Fix Notion indexing and add an admin view for Entry objects 2024-03-09 16:25:23 +05:30
sabaimran
b3b6278af2 Update documentation to show how you can upload files 2024-03-09 15:58:13 +05:30
sabaimran
12d6c4da7d Only include inferred queries in the conversation history for images, not links. Overflow the side panel when too long 2024-03-09 11:59:35 +05:30
Debanjum Singh Solanky
42d4bc6b14 Document installing Khoj on Phone as a Progressive Web App (PWA) 2024-03-08 21:18:06 +05:30
sabaimran
e5cd0237e3 Release Khoj version 1.6.2 2024-03-08 17:04:03 +05:30
Debanjum Singh Solanky
446ac7649d Remove unused js method in web chat client, add newline to web data in prompt 2024-03-08 16:40:39 +05:30
Debanjum Singh Solanky
12d32ac99c Increase user visibility into more errors during image generation
Catch OpenAI connection error and errors during better image prompt
generation
2024-03-08 16:40:39 +05:30
sabaimran
ff31759423 Fix target determination in the copy programmatic output button 2024-03-08 16:33:12 +05:30
sabaimran
9f934929c6 Infer mime type from file ending when not available in browser. Don't output image in conversation turns 2024-03-08 12:34:26 +05:30
sabaimran
81beb7940c Upload generated images to s3, if AWS credentials and bucket is available (#667)
* Upload generated images to s3, if AWS credentials and bucket is available.
- In clients, render the images via the URL if it's returned with a text-to-image2 intent type
* Make the loading screen more intuitve, less jerky and update the programmatic copy button
* Update the loading icon when waiting for a chat response
2024-03-08 10:54:13 +05:30
sabaimran
13894e1fd5 add instructions for drag/drop files in sys prompt 2024-03-07 17:57:42 +05:30
sabaimran
7357b6eff1 Revert white-space preline and add more detailed help text when selecting file 2024-03-06 16:47:27 +05:30
sabaimran
b615c0719e Support upload for files via drag/drop in the web UI (#666)
* Add additional styling changes for showing UI changes when dragging file to the main screen

* Add a loading spinner when file upload is in progress, and don't index github/notion when indexing files

* Add an explicit icon for file uploading in the chat button menu

* Add appropriate dragover styling when picking a file from the file picker/browser

* Add a loading screen when retrieving chat history. Fix width of the chat window. Put attachment icon to the left of chat input
2024-03-06 16:43:05 +05:30
sabaimran
e323a6d69b Include additional user context in the image generation flow (#660)
* Make major improvements to the image generation flow

- Include user context from online references and personal notes for generating images
- Dynamically select the modality that the LLM should respond with
- Retun the inferred context in the query response for the dekstop, web chat views to read

* Add unit tests for retrieving response modes via LLM

* Move output mode unit tests to the actor suite, rather than director

* Only show the references button if there is at least one available

* Rename aget_relevant_modes to aget_relevant_output_modes

* Use a shared method for generating reference sections, simplify some of the prompting logic

* Make out of space errors in the desktop client more obvious
2024-03-06 13:48:41 +05:30
sabaimran
3cbc5b0d52 Add links to blog in docs 2024-03-02 17:37:18 +05:30
sabaimran
880368635e Set default value of KHOJ_DEBUG to False in the docker-compose file 2024-03-01 21:51:13 +05:30
Debanjum Singh Solanky
2d61591c22 Improve user visibility into errors during image generation 2024-02-29 13:19:13 +05:30
sabaimran
0bbb5cff85 Release Khoj version 1.6.1 2024-02-26 13:27:20 -08:00
sabaimran
c8194a7364 Make out of space errors in the desktop client more obvious 2024-02-26 11:53:36 -08:00
Debanjum Singh Solanky
956dd71d91 Clean entry before adding to DB and log when it fails
Remove \0 null characters from entry fields as this is causing
indexing errors
2024-02-27 01:19:34 +05:30
Debanjum Singh Solanky
bb613a8e1d Make indentation styling more compact on Obsidian client 2024-02-25 14:41:45 +05:30
Debanjum Singh Solanky
682b70011f Set chat body height to remove UX jitter on chat history load in Web, Desktop 2024-02-25 14:40:47 +05:30
Debanjum Singh Solanky
efe86ce159 Fix saved conversation logger to handle image responses 2024-02-25 13:46:32 +05:30
Debanjum Singh Solanky
4839f2901a Open external links in Desktop app with default app for url on OS
- Open external links using the default link handler registered on OS
  for the link type, e.g http:// -> firefox, mailto: thunderbird etc
- Confirm before opening non-http URL using an external app
2024-02-25 13:21:52 +05:30
Debanjum
170bce2c02 Fix, Improve rendering images in Obsidian, Desktop, Web clients (#659)
- Improve render of inferred query in image chat messages in Web, Desktop apps
- Add inferred queries to image chat responses in Obsidian client
- Fix rendering images from Khoj response in Obsidian client
2024-02-25 00:56:26 +05:30
Debanjum Singh Solanky
f84606325c Improve render of inferred query in image chat messages in Web, Desktop apps 2024-02-25 00:47:06 +05:30
Debanjum Singh Solanky
a2e53d5e41 Add inferred queries to image chat responses in Obsidian client 2024-02-25 00:24:58 +05:30
Debanjum Singh Solanky
9b61f0b5f7 Fix rendering images from Khoj response in Obsidian client 2024-02-25 00:11:11 +05:30
sabaimran
b9d0533d92 Misc. fixes to prompting, admin, and others (#658)
* Simplify and clarify prompt for selecting toolset dynamically

* Add error handling around call to OLOSTEP api

* Fix conversation admin page

* Skip adding none or empty entries in the chunking method
2024-02-24 10:25:42 -08:00
Debanjum Singh Solanky
0e0e751ef7 Improve docstring of entrypoint function to the emacs client 2024-02-24 21:09:41 +05:30
Debanjum
8855529637 Improve Syncing Obsidian Vault, Invalidate Static Assets in Browser Cache in Web Client (#657)
- Improve
  - Only send files modified since their last sync for indexing on server from the Obsidian client
- Fix 
  - Invalidate static asset browser cache in Web client when Khoj version changes
2024-02-24 20:20:30 +05:30
Debanjum Singh Solanky
a46f70c4b0 Remove deprecated lastSyncedFiles settings field from Obsidian client 2024-02-24 20:18:22 +05:30
Debanjum Singh Solanky
03a6b491b2 Warn when can't identify mimeType of files in Desktop, Obsidian clients 2024-02-24 19:59:03 +05:30
Debanjum Singh Solanky
3675ab4864 Only sync modified files from the Obsidian client
Previously we'd send all files in vault and let the server
deduplicate.

This changes takes inspiration from the desktop app, and only pushes
files which were modified after their previous sync with the server.

This should reduce the processing load on the server
2024-02-24 07:48:40 +05:30
Debanjum Singh Solanky
ddfbf31bc8 Append version query param to web asset URLs to bypass browser cache
Ensure latest assets are loaded when khoj version is updated
2024-02-24 06:49:25 +05:30
sabaimran
42773e808c Retrieve, create, and save conversations differently for ClientApplications (#656)
* Retrieve, create, and save conversations differently if they're coming from a client application

- Not all of our client apps will necessarily maintain state over the conversation IDs available to a user. For some (single-threaded conversations), it should just use a single conversation. Fix the code to do so

* Simplify conversation retrieval logic

* Keep 0 padding below chat response

* Add order_by sorting to retrieving the conversation without id
2024-02-23 11:32:00 -08:00
Debanjum
9afb2a14ef Fix and Improve Chat UI in Web, Desktop apps (#655)
### Improvements to Chat UI on Web, Desktop apps
- Improve styling of chat session side panel
- Improve styling of chat message bubble in Desktop, Web app
- Add frosted, minimal chat UI to background of Login screen
- Improve PWA install experience of Khoj

### Fixes to Chat UI on Web, Desktop apps
- Fix creating new chat sessions from the Desktop app
- Only show 3 starter questions even when consecutive chat sessions created

### Other Improvements
- Update Khoj cloud trial period to a fortnight instead of a week
- Document using venv to handle dependency conflict on khoj pip install

Resolves #276
2024-02-23 19:27:02 +05:30
Debanjum Singh Solanky
c70ca78cdc Improve PWA install experience for Khoj on Desktop, Mobile
- Resolve PWA issues thrown by Chrome/Edge
  - Add screenshot samples showcasing remember, browse and draw features
    - This can provide a richer app store like experience when
      installing Khoj PWA on Mobile or Desktop
    - Add wide and narrow screenshots to show Mobile vs Desktop UX
  - Add higher resolution favicon for PWA
- Use single web manifest instead of separate ones for Chat, Search
- Update manifest description with more details about Khoj features
2024-02-23 18:59:52 +05:30
Debanjum Singh Solanky
e10b260988 Update web login screen to show frosted minimal chat UI in background 2024-02-23 18:59:52 +05:30
Debanjum Singh Solanky
1b0318564e Log when conversation turn is saved to DB 2024-02-23 18:59:52 +05:30
Debanjum Singh Solanky
4c39960917 Make number of conversation starters to get from DB configurable 2024-02-23 18:59:52 +05:30
Debanjum Singh Solanky
50617594fd Only show 3 starter questions even when consecutive chat sessions created
Reset starter question suggestions before appending in web, desktop app

Otherwise previously it'd keep adding to existing starter question
suggestions on each new session creation if multiple consecutive new
chat sessions created.

This would result in more than the 3 expected starter questions being
displayed at a time
2024-02-23 18:59:52 +05:30
Debanjum Singh Solanky
102f5c3f53 Improve styling of chat session side panel
- Make collapse, expand toggle arrow point in the direction the action
  will expand the side panel in
- Make the collapsed side panel reduce to a 1px sliver
2024-02-23 18:59:52 +05:30
Debanjum Singh Solanky
6283d9fe83 Update Khoj cloud trial period to a fortnight instead of a week
- Improve rate limit error message wording
- Make the "too many requests" error message more robust. Should throw
  that exception fix self.request >= self.subscribed_requests because
  upgrading wouldn't fix this rate limiting
2024-02-23 18:33:56 +05:30
Debanjum Singh Solanky
05c1903784 Fix creating new chat sessions from the Desktop app
Code wasn't passing the authorization header in the POST request to
create new chat session
2024-02-23 18:33:56 +05:30
Debanjum Singh Solanky
8a219b6e9c Improve styling of chat message bubble in Desktop, Web app
- Respect newline with pre-line but not for bullets to improve
  formatting of responses by Khoj
- Respect bold font by loading tajawal font with other weights
- Reduce bottom margin in chat message bubble, its taking too much space
2024-02-23 18:33:56 +05:30
sabaimran
b4902090e7 Misc. chat and application improvements (#652)
* Document original query when subqueries can't be generated
* Only add messages to the chat message log if it's non-empty
* When changing the search model, alert the user that all underlying data will be deleted
* Adding more clarification to the prompt input for username, location
* Check if has_more is in the notion results before getting next_cursor
* Update prompt template for user name/location, update confirmation message when changing search model
2024-02-22 19:09:22 -08:00
Debanjum Singh Solanky
57dce91c91 Document using venv to handle dependency conflict on khoj pip install
Resolves #276
2024-02-23 02:07:08 +05:30
Debanjum Singh Solanky
7271164256 Set chat session title to textContent of the chat session HTML element
We don't expect/want the user to use HTML titles for chat session
2024-02-23 02:07:08 +05:30
sabaimran
f8ec6b4464 Remove backslash for default route in api_chat 2024-02-20 20:09:44 -08:00
sabaimran
699545366b Set gunicorn config to use 4 workers 2024-02-20 15:06:20 -08:00
sabaimran
b1c86fee3b Release Khoj version 1.6.0 2024-02-20 14:12:24 -08:00
sabaimran
45c5a2598d Temp - change gunicorn config to use a single worker 2024-02-20 13:56:51 -08:00
sabaimran
44f8f20ea7 Miscellaneous bugs and fixes for chat sessions (#646)
* Display given_name field only if it is not None

* Add default slugs in the migration script

* Ensure that updated_at is saved appropriately, make sure most recent chat is returned for default history

* Remove the bin button from the chat interface, given deletion is handled in the drop-down menus

* Refresh the side panel when a new chat is created

* Improveme tool retrieval prompt, don't let /online fail, and improve parsing of extract questions

* Fix ending chat response by offline chat on hitting a stop phrase

Previously the whole phrase wouldn't be in the same response chunk, so
chat response wouldn't stop on hitting a stop phrase

Now use a queue to keep track of last 3 chunks, and to stop responding
when hit a stop phrase

* Make chat on Obsidian backward compatible post chat session API updates

- Make chat on Obsidian get chat history from
  `responseJson.response.chat' when available (i.e when using new api)
- Else fallback to loading chat history from
  responseJson.response (i.e when using old api)

* Fix detecting success of indexing update in khoj.el

When khoj.el attempts to index on a Khoj server served behind an https
endpoint, the success reponse status contains plist with certs. This
doesn't mean the update failed.

Look for :errors key in status instead to determine if indexing API
call failed. This fixes detecting indexing API call success on the
Khoj Emacs client, even for Khoj servers running behind SSL/HTTPS

* Fix the mechanism for populating notes references in the conversation primer for both offline and online chat

* Return conversation.default when empty list for dynamic prompt selection, send all cmds in telemetry

* Fix making chat on Obsidian backward compatible post chat session API updates

New API always has conversation_id set, not `chat' which can be unset
when chat session is empty.

So use conversation_id to decide whether to get chat logs from
`responseJson.response.chat' or `responseJson.response' instead

---------

Co-authored-by: Debanjum Singh Solanky <debanjum@gmail.com>
2024-02-20 13:55:35 -08:00
sabaimran
138f5223bd Fix process for generating embeddings for Notion entries (#648)
* Fix process for generating embeddings for Notion entries
* If no title field found, just log a warning and set the title to
2024-02-20 13:46:56 -08:00
Debanjum
43013c4fd4 Make Production Dependencies for Khoj Cloud Optional to Install (#647)
- Remove unused git dependency from Docker images
- Move python packages used for test into dev dependency group
- Only enable API token, Whatsapp cards on Web UI when Stripe, Twilio setup
- Move production dependencies to prod python packages group
- Fix docs links in Khoj welcome chat message
2024-02-16 17:42:23 +05:30
Debanjum Singh Solanky
4696577636 Upgrade python dependencies 2024-02-16 17:41:09 +05:30
Debanjum Singh Solanky
4007c871ae Remove unused git dependency from Docker images 2024-02-16 17:41:09 +05:30
Debanjum Singh Solanky
e21a8530f3 Move used python packages for test into dev dependency group
The test dependency group was being used independently
2024-02-16 17:41:09 +05:30
Debanjum Singh Solanky
4722da9642 Only enable API token, Whatsapp cards on Web UI when Stripe, Twilio setup 2024-02-16 17:41:09 +05:30
Debanjum Singh Solanky
cf4a524988 Move production dependencies to prod python packages group
This will reduce khoj dependencies to install for self-hosting users

- Move auth production dependencies to prod python packages group
  - Only enable authentication API router if not in anonymous mode
  - Improve error with requirements to enable authentication when not in
    anonymous mode
2024-02-16 17:41:08 +05:30
Debanjum Singh Solanky
d7dbb715ef Fix docs links in khoj introductory chat message 2024-02-13 22:38:03 +05:30
sabaimran
32ec54172e Add additional personalization in Chat via Location, Username (#644)
* Add location metadata to chat history
* Add support for custom configuration of the user name
* Add region, country, city in the desktop app's URL for context in chat
* Update prompts to specify user location, rather than just location.
* Add location data to Obsidian chat query
* Use first word for first name, last word for last name when setting profile name
2024-02-13 17:05:13 +05:30
sabaimran
a3eb17b7d4 Have Khoj dynamically select conversation command(s) in chat (#641)
* Have Khoj dynamically select which conversation command(s) are to be used in the chat flow
- Intercept the commands if in default mode, and have Khoj dynamically guess which tools would be the most relevant for answering the user's query
* Remove conditional for default to enter online search mode
* Add multiple-tool examples in the prompt, make prompt for tools more specific to info collection
2024-02-11 17:11:32 +05:30
sabaimran
69344a6aa6 Add support for multiple chat sessions in the desktop application (#639)
* Add chat sessions to the desktop application
* Increase width of the main chat body to 90vw
* Update the version of electron
* Render the default message if chat history fails to load
* Merge conversation migrations and fix slug setting
* Update the welcome message, use the hostURL, and update background color for chat actions
* Only update the window's web contents if the page is config
2024-02-11 16:05:28 +05:30
sabaimran
1412ed6a00 Support multiple chat sessions within the web UI (#638)
* Enable support for multiple chat sessions within the web client

- Allow users to create multiple chat sessions and manage them
- Give chat session slugs based on the most recent message
- Update web UI to have a collapsible menu with active chats
- Move chat routes into a separate file

* Make the collapsible side panel more graceful, improve some styling elements of the new layout

* Support modification of the conversation title

- Add a new field to the conversation object
- Update UI to add a threedotmenu to each conversation

* Get the default conversation if a matching one is not found by id
2024-02-11 15:48:28 +05:30
sabaimran
208ccc83ec Fix version of gpt4all to 2.1.0 as it's not backwards compatible 2024-02-10 09:32:04 +05:30
Debanjum Singh Solanky
70f74cde68 Fix timestamps to separate each logline. Info log response start time 2024-02-07 20:45:16 +05:30
Debanjum Singh Solanky
667b975400 Free space on Github workflow VM to build Khoj docker images 2024-02-06 23:37:51 +05:30
Debanjum Singh Solanky
8e5db72140 Release Khoj version 1.5.1 2024-02-06 23:09:33 +05:30
Debanjum
fc1b8f6fb6 Fix Khoj Obsidian plugin on Obsidian Mobile (#635)
- Removed node-fetch dependency to work on mobile. 
- Fix CORS issue for Khoj (streaming) chat on Obsidian mobile
- Verified Khoj plugin, search, chat work on Obsidian mobile.

## Details
### Major
- Allow calls to Khoj server from Obsidian mobile app to fix CORS issue
- Chat stream using default `fetch' not `node-fetch' in obsidian plugin

### Minor
- Load chat history after other elements in chat modal on Obsidian are rendered
- Scroll to bottom of chat modal on Obsidian across mobile & desktop
2024-02-06 22:03:51 +05:30
Debanjum
c6fa98ce3e Make Offline Chat Date Aware (#636)
- Provide more context and instructions to offline chat on Khoj
- Upgrade offilne chat quality tests to support more use-cases

### Details
- Improve offline chat system prompt to think step by step
- Make offline chat model current date aware. Improve system prompts
- Fix actor, director tests using freeze time by ignoring transformers package
2024-02-06 21:32:34 +05:30
Debanjum Singh Solanky
fd238ff792 Load chat history after other elements in chat modal on Obsidian rendered
This reduces laggy feeling due to latency of loading chat history from
server
2024-02-06 21:25:43 +05:30
Debanjum Singh Solanky
e06a0c6ae0 Scroll to bottom of chat modal on Obsidian across mobile & desktop
Put logic into single reused function
2024-02-06 21:25:43 +05:30
Debanjum Singh Solanky
07dc04f40e Allow calls to Khoj server from Obsidian mobile app to fix CORS issue
- Obsidian mobile uses capacitor js. Requests from it have origin as
  http://localhost on Android and capacitor://localhost on iOS
- Allow those Obsidian mobile origins in CORS middleware of server
2024-02-06 21:25:43 +05:30
Debanjum Singh Solanky
dd4cf66be1 Improve offline chat system prompt to think step by step 2024-02-06 20:23:19 +05:30
Debanjum Singh Solanky
035165b534 Make offline chat model current date aware. Improve system prompts
- Can now expect date awareness chat quality test to pass
- Prevent offline chat model from printing verbatim user Notes and
  special tokens
- Make it ask follow-up questions if it needs more context
2024-02-06 20:23:19 +05:30
Debanjum Singh Solanky
447904f0ab Chat stream using default fetch' not node-fetch' in obsidian plugin
Plugins using NodeJS libraries like `node-fetch' don't work on
Obsidian mobile
2024-02-06 03:03:42 +05:30
Debanjum Singh Solanky
0d949140f4 Fix actor, director tests using freeze time by ignoring transformers package
transformers package was causing freeze time to fail during setup
2024-02-06 03:00:48 +05:30
Debanjum Singh Solanky
c40f642afa Move Use OpenAI Compatible LLM Server section to existing advanced page
Add footnote on supported chat models to the self-hosting section
2024-02-04 16:16:55 +05:30
Debanjum Singh Solanky
523af5b3aa Fix docs. Chat model options need to be set if using OpenAI proxy server 2024-02-04 06:42:05 +05:30
Debanjum Singh Solanky
ba79334863 Only log number of day old user requests, not the complete dictionary 2024-02-02 10:33:31 +05:30
Debanjum Singh Solanky
474afa5efe Document using OpenAI-compatible LLM API server for Khoj chat
This allows using open or commerical, local or hosted LLM models that
are not supported in Khoj by default.

It also allows users to use other local LLM API servers that support
their GPU

Closes #407
2024-02-02 10:31:27 +05:30
Debanjum Singh Solanky
1c6f1d94f5 Fix styling of Whatsapp card & notify banner in config page of web app
- Put Whatsapp card back in Client section.
  - Fixes side spacing on cards
  - Improve Whatsapp card row gaps

- Hide notification banner on web app load. Previously it showed up as
  a yellow dot on smaller displays
2024-02-01 22:59:57 +05:30
Debanjum Singh Solanky
e05474e7e0 Say when max-prompt, tokenizer fields needs setup in self-host docs 2024-01-31 08:42:22 +05:30
sabaimran
4daac334bc Fix subscription state detection for users based on phone numbers, emails (#633)
* Fix subscription state detection for users based on phone numbers, emails
* Fix unit tests for api_user4
* Use a single method for determining subscription from user
* Pass user object, rather than user.email for getting subscription state
2024-01-31 07:48:55 +05:30
sabaimran
fc4b57d9f6 Revert styling for white-space pre-line in the chat views as it looks bad 2024-01-29 18:29:54 +05:30
sabaimran
da854703aa Release Khoj version 1.5.0 2024-01-29 18:05:10 +05:30
Debanjum
d1bfb245df Improve Khoj Chat and Settings UI (#630)
* Fix license in pyproject.toml. Remove unused utils.state import

* Use single debug mode check function. Disable telemetry in debug mode

- Use single logic to check if khoj is running in debug mode.
  Previously there were 3 different variants of the check

- Do not log telemetry if KHOJ_DEBUG is set to true. Previously didn't
log telemetry even if KHOJ_DEBUG set to false

* Respect line breaks in user, khoj chat messages to improve formatting

* Disable Whatsapp config section on web client if Twilio not configured

Simplify Whatsapp configuration status checking js by standardizing
external input to lower case

* Disable Phone API when Twilio not setup and rate limit calls to it

- Move phone api to separate router and only enable it if Twilio enabled
- Add rate-limiting to OTP and verification calls

* Add slugs for phone rate limiting

---------

Co-authored-by: sabaimran <narmiabas@gmail.com>
2024-01-29 18:03:43 +05:30
sabaimran
9ad44f0e77 Include info about privacy in the docs (#631)
* Add a page about privacy and organize some of the documentation
* Add notice about telemetry
* Improve copy for privacy section, link to telemetry section
2024-01-29 17:47:23 +05:30
sabaimran
4fb8d5c6d4 Store rate limiter-related metadata in the database for more resilience (#629)
* Store rate limiter-related metadata in the database for more resilience
- This helps maintain state even between server restarts
- Allows you to scale up workers on your service without having to implement sticky routing
* Make the usage exceeded message less abrasive
* Fix rate limiter for specific conversation commands and improve the copy
2024-01-29 15:27:06 +05:30
sabaimran
71cbe5160d Add retries in case the embeddings API fails (#628)
* Add retries in case the embeddings API fails
* Improve error handling in the inference endpoint API request handler
- retry only if HTTP exception
- use logger to output information about errors
2024-01-29 15:26:34 +05:30
sabaimran
b782683e60 Scrape results from Serper results using Olostep (#627)
* Initailize changes to incporate web scraping logic after getting SERP results
- Do some minor refactors to pass a symptom prompt to the openai model when making a query
- integrate Olostep in order to perform the webscraping
* Fix truncation error with new line, fix typing in olostep code
* Use the authorization header for the token
* Add a small hint/indicator for how to use Khojs other modalities in the welcome prompt
* Add more detailed error message if Olostep query fails
* Add unit tests which invoke Olostep in chat director
* Add test for olostep tool
2024-01-29 14:16:50 +05:30
sabaimran
360b59cdb2 Add handling for None field values in logs and make telemetry upload more frequent 2024-01-26 00:00:55 +05:30
sabaimran
737fb6417b Revert none checking in telemetry logs 2024-01-25 23:48:09 +05:30
sabaimran
211c5623e8 Improve error handling for telemetry uploads
- Use response.raise_for_status when telemetry upload files
- Do not send null packets to the destination server
2024-01-25 20:40:42 +05:30
Debanjum Singh Solanky
098a8e4fb1 Fix evaluating connected to server status in Obsidian plugin
Only show welcome status message when khojApiKey not set and khojUrl
set to khoj cloud
2024-01-25 18:04:29 +05:30
Debanjum Singh Solanky
518f3c0c99 Update docs to say khoj chat shown on obsidian ribbon now 2024-01-25 18:03:22 +05:30
Debanjum Singh Solanky
1c52ddf792 Bump up server side content indexing interval to ~1 day
Reduce server side indexing load and API request failures
2024-01-25 13:33:34 +05:30
sabaimran
0fba1e27c5 Add hint to input text for using slash commands 2024-01-25 11:56:56 +05:30
sabaimran
da6cd5ddc4 Improve subqueries for online search and prompt generation for image (#626)
* Improve subqueries for online search and prompt generation for image
- Include conversation history so that subqueries or intermediate prompts are generated with the appropriate context
2024-01-24 17:42:59 +05:30
sabaimran
dbdca7d8d1 Disable swagger UI docs in production 2024-01-24 15:23:39 +05:30
sabaimran
ddf6fd9c09 Remove valid number alert 2024-01-23 17:57:27 +05:30
Debanjum Singh Solanky
17107a0337 Release Khoj version 1.4.0 2024-01-23 10:18:31 +05:30
Debanjum Singh Solanky
f69eafe95a Update Readme with updated capabilties 2024-01-23 09:56:01 +05:30
sabaimran
679db51453 Add support for phone number authentication with Khoj (part 2) (#621)
* Allow users to configure phone numbers with the Khoj server

* Integration of API endpoint for updating phone number

* Add phone number association and OTP via Twilio for users connecting to WhatsApp

- When verified, store the result as such in the KhojUser object

* Add a Whatsapp.svg for configuring phone number

* Change setup hint depending on whether the user has a number already connected or not

* Add an integrity check for the intl tel js dependency

* Customize the UI based on whether the user has verified their phone number

- Update API routes to make nomenclature for phone addition and verification more straightforward (just /config/phone, etc).
- If user has not verified, prompt them for another verification code (if verification is enabled) in the configuration page

* Use the verified filter only if the user is linked to an account with an email

* Add some basic documentation for using the WhatsApp client with Khoj

* Point help text to the docs, rather than landing page info

* Update messages on various callbacks and add link to docs page to learn more about the integration
2024-01-22 18:14:58 -08:00
sabaimran
58bf917775 Update the font used across Khoj desktop and web to be Tajawal (#622) 2024-01-20 23:13:33 +05:30
Debanjum
679f0f24a4 Improve Chat Input Pane Actions. Move to 1 Click Audio Chat on Mobile (#624)
## Major
### Move to single click audio chat UX on Obsidian, Desktop, Web clients
  New default UX has 1 long-press on mobile, 2-click on desktop to send transcribed audio message
  - New Audio Chat Flow
    1. Record audio while microphone button pressed
    2. Show auto-send 3s countdown timer UI for audio chat message
        Provide a visual cue around send button for how long before audio
        message is automatically sent to Khoj for response
    3. Auto-send msg in 3s unless stop send message button clicked
  - Why
    - Removes the previous default of 3 clicks required to send audio message
       The record > stop > send process to send audio messages was unclear and effortful
    - Still allows stopping message from being sent, to make correction to transcribed audio
    - Removes inadvertent long audio transcriptions if forget to press stop while recording

### Improve chat input pane actions & icons on Obsidian. Desktop, Web clients
- Use SVG icons in chat footer on web, desktop app
- Move delete icon to left of chat input. This makes it harder to inadvertently click it
- Add send button to chat input pane
- Color chat message send button to make it primary CTA
- Make chat footer shorter. Use no or round border on action buttons

## Minor
- Stop rendering empty starter questions element when no questions present
- Add round border, hover color to starter questions in web, desktop apps
- Fix auto resizing chat input box when transcribed text added
- Convert chat input into a text area in the Obsidian client
2024-01-20 21:52:56 +05:30
Debanjum Singh Solanky
ec3b837d00 Send audio message in 2-clicks on desktop to avoid holding down mic button 2024-01-20 21:40:38 +05:30
Debanjum Singh Solanky
f0daa45ae0 Move to single click audio chat UX on Obsidian client
- Capabillity
  New default UX has 1 long-press to send transcribed audio message

  - Removes the previous default of 3 clicks required to send audio message
    - The record > stop > send process to send audio messages was unclear
  - Still allows stopping message from being sent, if users want to make
    correction to transcribed audio
  - Removes inadvertent long audio transcriptions if user forgets to
    press stop when recording

- Changes
  - Record audio while microphone button pressed
  - Show auto-send 3s countdown timer UI for audio chat message
    Provide a visual cue around send button for how long before audio
    message is automatically sent to Khoj for response
  - Auto-send msg in 3s unless stop send message button clicked
2024-01-20 16:07:12 +05:30
Debanjum Singh Solanky
29a581d2b0 Move to single click audio chat UX on desktop app
- Capabillity
  New default UX has 1 long-press to send transcribed audio message

  - Removes the previous default of 3 clicks required to send audio message
    - The record > stop > send process to send audio messages was unclear
  - Still allows stopping message from being sent, if users want to make
    correction to transcribed audio
  - Removes inadvertent long audio transcriptions if user forgets to
    press stop when recording

- Changes
  - Record audio while microphone button pressed
  - Show auto-send 3s countdown timer UI for audio chat message
    Provide a visual cue around send button for how long before audio
    message is automatically sent to Khoj for response
  - Auto-send msg in 3s unless stop send message button clicked
2024-01-20 16:03:51 +05:30
Debanjum Singh Solanky
699e9ff878 Move to single click audio chat UX on web app
- Capabillity
  New default UX has 1 long-press to send transcribed audio message

  - Removes the previous default of 3 clicks required to send audio message
    - The record > stop > send process to send audio messages was unclear
  - Still allows stopping message from being sent, if users want to make
    correction to transcribed audio
  - Removes inadvertent long audio transcriptions if user forgets to
    press stop when recording

- Changes
  - Record audio while microphone button pressed
  - Show auto-send 3s countdown timer UI for audio chat message
    Provide a visual cue around send button for how long before audio
    message is automatically sent to Khoj for response
  - Auto-send msg in 3s unless stop send message button clicked
2024-01-20 15:56:46 +05:30
Debanjum Singh Solanky
26bd3533d8 Stop rendering empty starter questions element when no questions present 2024-01-20 11:39:58 +05:30
Debanjum Singh Solanky
7c8c475c3a Add round border, hover color to starter questions in web, desktop apps 2024-01-20 00:51:11 +05:30
Debanjum Singh Solanky
8a488b9e39 Fix auto resizing chat input box when transcribed text added 2024-01-20 00:48:56 +05:30
Debanjum Singh Solanky
07ca137bdf Convert chat input into a text area in the Obsidian client
This allows for better readability of multi-line messages by users.
The chat input is a text area in the other clients as well.
2024-01-20 00:48:56 +05:30
Debanjum Singh Solanky
d4552117f6 Add and improve chat input pane, actions, icons on Obsidian client
- Move delete icon to left of chat input. This makes it harder to
  inadvertently click
- Add send button to chat footer. Enter being the only way to send
  messages is not intuitive, outside standard modern UI patterns
- Color chat message send button to make it primary CTA on web client
- Make chat footer shorter. Use no or round border on action buttons
2024-01-20 00:48:56 +05:30
Debanjum Singh Solanky
c0ad64d9a3 Add and improve chat input pane, actions, icons on desktop client
- Use SVG icons in chat footer on web
- Move delete icon to left of chat input. This makes it harder to
  inadvertently click
- Add send button to chat footer. Enter being the only way to send
  messages is not intuitive, outside standard modern UI patterns
- Color chat message send button to make it primary CTA on web client
- Make chat footer shorter. Use no or round border on action buttons
2024-01-20 00:29:49 +05:30
Debanjum Singh Solanky
ea85ebdacb Add and improve chat input pane, actions, icons on web client
- Use SVG icons in chat footer on web
- Move delete icon to left of chat input. This makes it harder to
  inadvertently click
- Add send button to chat footer. Enter being the only way to send
  messages is not intuitive, outside standard modern UI patterns
- Color chat message send button to make it primary CTA on web client
- Make chat footer shorter. Use no or round border on action buttons
2024-01-19 20:40:42 +05:30
sabaimran
039ed78253 Add support for a first-party client app to call into Khoj (Part 1) (#601)
* Add support for a first party client app
- Based on a client id and client secret, allow a first party app to call into the Khoj backend with a phone number identifier
- Add migration to add phone numbers to the KhojUser object
* Add plus in front of country code when registering a phone number.
- Decrease free tier limit to 5 (from 10)
- Return a response object when handling stripe webhooks
* Fix telemetry method which references authenticated user's client app
* Add better error handling for null phone numbers, simplify logic of authenticating user
* Pull the client_secret in the API call from the authorization header
* Add a migration merge to resolve phone number and other changes
2024-01-18 19:24:14 +05:30
Debanjum Singh Solanky
9dfe1bb003 Fix updating subscription when invoice paid. Revert renewal_date logic
The actual issue was that `get_or_create_user_by_email' tried to
create a subscription even if it already existed.

With updated logic:
- New subscription is only created when it doesn't already exist in
  `get_or_create_user_by_email'
- `set_user_subscription' just updates the subscription state as
  user subscription object creation is already managed by
  `get_or_create_user_by_email'. So the other conditionals are
  unnecessary
2024-01-18 16:20:18 +05:30
Debanjum Singh Solanky
9b1a66c969 Fix updating subscription renewal date when invoice paid 2024-01-18 14:46:10 +05:30
sabaimran
93d5cb128c Initialize embeddings to empty list before processing 2024-01-18 13:27:04 +05:30
Debanjum Singh Solanky
24af888c41 Release Khoj version 1.3.0 2024-01-18 11:42:13 +05:30
Debanjum Singh Solanky
2f1bb5c2c8 Upload Desktop App Artifacts to Github Release 2024-01-18 11:40:04 +05:30
sabaimran
e71ebb8068 Standardize issue templates and make them easier to use 2024-01-18 10:54:05 +05:30
sabaimran
efb4bd6780 Add a template for feature requests 2024-01-18 10:38:53 +05:30
sabaimran
6165ae56c2 Update bug report issue template
- collect info about OS, device, server, client, and prompt to include any relevant data
2024-01-18 10:35:02 +05:30
Debanjum
8b4dd16255 Fix markdownRenderer arg to allow chat responses in Obsidian plugin (#619)
- Issue
Users with Dataview plugin would have error as its markdown
post-processor expects the sourcePath to be a string

This prevents Khoj from responding to chat messages in the Obsidian
chat modal. Search via Obsidian still works but it throws the same
dataview plugin error

- Fix
Pass a string as sourcePath to markdownRenderer to fix failing chat response
and stop throwing dataview errors on search

Resolves #614, Resolves #606
2024-01-18 10:18:31 +05:30
Debanjum
c8dbe8ee7b Improve server status check and message in Obsidian client (#617)
- Update health API to pass authenticated users their info
- Improve Khoj server status check in Khoj Obsidian client
- Show Khoj Obsidian commands even if no connection to server
- Show Khoj chat by default in Obsidian side pane instead of search
2024-01-18 10:17:35 +05:30
Debanjum Singh Solanky
f9420e1209 Show Khoj Obsidian commands even if no connection to server
Server connection check can be a little flaky in Obsidian. Don't gate
the commands behind it to improve usability of Khoj.

Previously the commands would get disabled when server connection
check failed, even though server was actually accessible
2024-01-18 10:09:20 +05:30
Debanjum Singh Solanky
36bf42a860 Show Khoj chat by default in Obsidian side pane instead of search 2024-01-18 10:09:20 +05:30
Debanjum Singh Solanky
aab75a6ead Improve Khoj server status check in Khoj Obsidian client
- Update server connection status on every edit of khoj url, api key in
  settings instead of only on plugin load

  The error message was stale if connection fixed after changes in
  Khoj plugin settings to URL or API key, like on plugin install

- Show better welcome message on first plugin install.
  Include API key setup instruction

- Show logged in user email on Khoj settings page
2024-01-18 10:09:20 +05:30
Debanjum Singh Solanky
1a46734485 Fix markdownRenderer arg to allow chat responses in Obsidian plugin
- Issue: Users with Dataview plugin would have error as its markdown
post-processor expects the sourcePath to be a string

This prevents Khoj from responding to chat messages in the Obsidian
chat modal. Search via Obsidian still works but it throws the same
dataview error

- Fix: Pass a string as sourcePath to markdownRenderer to fix
failing chat response

Resolves #614, Resolves #606
2024-01-18 10:02:50 +05:30
sabaimran
e9e49ea098 Allow custom inference endpoint for the crossencoder model (#616)
* Add support for custom inference endpoints for the cross encoder model
- Since there's not a good out of the box solution, I've deployed a custom model/handler via huggingface to support this use case.
* Use langchain.community for pdf, openai chat modules
* Add an explicit stipulation that the api endpoint for crossencoder inference should be for huggingface for now
2024-01-18 10:02:12 +05:30
Debanjum Singh Solanky
08012c71b1 Update Dockerfile with swig system package required by PyMuPDF 2024-01-17 19:24:27 +05:30
Debanjum Singh Solanky
870af19ba4 Update health API to pass authenticated users their info
This allows Khoj clients to get email address associated with
user's API token for display in client UX

In anonymous mode, default user information is passed
2024-01-17 13:38:57 +05:30
Debanjum
4d30f7d1d9 Short-circuit API rate limiter for unauthenticated users (#607)
### Major
- Short-circuit API rate limiter for unauthenticated user
  Calls by unauthenticated users were failing at API rate limiter as it
  failed to access user info object. This is a bug.
  
  API rate limiter should short-circuit for unauthenicated users so a
  proper Forbidden response can be returned by API
  
  Add regression test to verify that unauthenticated users get 403
  response when calling the /chat API endpoint
  
### Minor
- Remove trailing slash to normalize khoj url in obsidian plugin settings
- Move used /api/config API controllers into separate module
- Delete unused /api/beta API endpoint
- Fix error message rendering in khoj.el, khoj obsidian chat
- Handle deprecation warnings for subscribe renew date, langchain, pydantic & logger.warn
2024-01-17 00:59:52 +05:30
Debanjum Singh Solanky
d26a4ffcea Only run the OpenAI chat client, /online test when API keys are set 2024-01-17 00:36:03 +05:30
Debanjum Singh Solanky
2752e0d607 Update jinja2 and axios min supported package versions 2024-01-16 18:45:38 +05:30
Debanjum Singh Solanky
7039c202c8 Merge branch 'master' into short-circuit-api-rate-limiter 2024-01-16 18:18:34 +05:30
Debanjum Singh Solanky
8917228dbb Remove unused, deprecated /api/config/data API endpoints
- Use /api/health for server up check instead of api/config/default
- Remove unused `khoj--post-new-config' method
- Remove the now unused /config/data GET, POST API endpoints
2024-01-16 18:15:06 +05:30
Debanjum
51c59d0059 Remove the 1000 files limit when syncing from Desktop, Obsidian clients (#605)
### Major
- Push 1000 files at a time from the Desktop client for indexing
- Push 1000 files at a time from the Obsidian client for indexing
- Test 1000 file upload limit to index/update API endpoint

### Minor
- Show relevant error message in desktop app, e.g when can't connect to server
- Pass indexed filenames in API response for client validation
- Collect files to index in single dict to simplify index/update controller

Resolves #573
2024-01-16 17:59:26 +05:30
Debanjum Singh Solanky
6ded4c1d75 Merge branch 'master' into fix-1000-file-index-update-limit 2024-01-16 16:50:58 +05:30
sabaimran
c24389cff5 Add Algolia to documentation website for better search 2024-01-16 15:53:53 +05:30
Debanjum
45f892dfdd Fix Offline Chat without GPU and Decoding Chat Query before Processing
- Only run /online command offline chat director test when `SERPER DEV_API_KEY' present
- Decode URL encoded query string in chat API endpoint before processing
- Make references and online_results optional params to converse_offline
- Pass max context length to fix using updated `GPT4All.list_gpu' method
2024-01-16 14:53:34 +05:30
Debanjum Singh Solanky
e0b381d523 Only run /online command offline chat director test when SERPER KEY present 2024-01-16 13:09:38 +05:30
Debanjum Singh Solanky
16175137e5 Decode URL encoded query string in chat API endpoint before processing 2024-01-16 13:09:28 +05:30
Debanjum Singh Solanky
9fe1c8ae13 Make references and online_results optional params to converse_offline
Fixes all the failing GPT4All tests because they were missing the
online_results argument
2024-01-16 13:09:28 +05:30
Debanjum Singh Solanky
d74f8e03d3 Pass max context length to fix using updated GPT4All.list_gpu method
It's signature was updated in GPT4All 2.1.0 pypi release.

Resolves #610
2024-01-16 12:23:45 +05:30
Debanjum Singh Solanky
1ae6669fbf Correctly handle API response when no files to index 2024-01-16 11:57:40 +05:30
sabaimran
50575b749b Add option to use HuggingFace's inference endpoint for generating embeddings (#609)
* Support using hosted Huggingface inference endpoint for embeddings generation
* Since the huggingface inference endpoint is model-specific, make the URL an optional property of the search model config
* Handle ECONNREFUSED error in desktop app
* Drive API key via the search model config model and use more generic names
2024-01-16 08:58:24 +05:30
Debanjum Singh Solanky
ba37b28fb5 Improve batched error handling. Catch can't connect to server error
Break out of batch processing when unable to connect to server or
when requests throttled by server
2024-01-14 01:04:44 +05:30
Debanjum Singh Solanky
7dfbcd2e5a Handle subscribe renew date, langchain, pydantic & logger.warn warnings
- Ensure langchain less than 0.2.0 is used, to prevent breaking
  ChatOpenAI, PyMuPDF usage due to their deprecation after 0.2.0
- Set subscription renewal date to a timezone aware datetime
- Use logger.warning instead of logger.warn as latter is deprecated
- Use `model_dump' not deprecated dict to get all configured content_types
2024-01-12 01:46:52 +05:30
Debanjum Singh Solanky
5f97357fe0 Delete unused /api/beta API endpoint 2024-01-12 01:11:05 +05:30
Debanjum Singh Solanky
bb1c1b39d8 Move /api/config API controllers into separate module for code modularity 2024-01-12 01:11:04 +05:30
Debanjum Singh Solanky
ba99089a12 Short-circuit API rate limiter for unauthenticated user
Calls by unauthenticated users were failing at API rate limiter as it
failed to access user info object. This is a bug.

API rate limiter should short-circuit for unauthenicated users so a
proper Forbidden response can be returned by API

Add regression test to verify that unauthenticated users get 403
response when calling the /chat API endpoint
2024-01-12 00:23:50 +05:30
Debanjum Singh Solanky
b1269fdad2 Remove trailing slash to normalize khoj url in obsidian plugin settings 2024-01-11 21:56:36 +05:30
Debanjum Singh Solanky
ffdb291fe0 Fix error message rendering in khoj.el, khoj obsidian chat
- Fix failed to index error message in khoj.el
- Fix chat model not configured message in khoj obsidian chat
2024-01-11 21:55:54 +05:30
Debanjum Singh Solanky
af9ceb00a0 Show relevant error msg in desktop app, e.g when can't connect to server 2024-01-09 23:09:34 +05:30
Debanjum Singh Solanky
43423432ce Pass indexed filenames in API response for client validation 2024-01-09 23:09:34 +05:30
Debanjum Singh Solanky
5f9ac5a630 Collect files to index in single dict to simplify index/update controller
Simplifies code while maintaining typing
2024-01-09 23:09:34 +05:30
Debanjum Singh Solanky
efe41aaaca Push 1000 files at a time from the Desktop client for indexing
FastAPI API endpoints only support uploading 1000 files at a time.
So split all files to index into groups of 1000 for upload to
index/update API endpoint
2024-01-09 23:09:34 +05:30
sabaimran
02187b19bb Customize font styling for documentation 2024-01-08 08:50:42 +05:30
sabaimran
8389108653 Fix reference issue for demos in the main README 2024-01-08 08:29:51 +05:30
Debanjum
dbc59b2952 Fix, Improve Khoj Documentation Layout (#604)
- 26f96e00 Use Khoj Client, Data sources diagrams in feature docs
- c82d34b6 Add Docs footer, nav pane links. Fix tagline, Remove announcement topbar
- d920e4d0 Make the docs overview page as the main docs landing page
- 80d1ad5b Fix image urls on docs overview page. Remove logo header in client docs
2024-01-08 02:00:02 +05:30
Debanjum Singh Solanky
efc7b08cd9 Use Khoj Client, Data sources diagrams in feature docs 2024-01-08 01:58:57 +05:30
Debanjum Singh Solanky
c82d34b659 Add Docs footer, nav pane links. Fix tagline, Remove announcement topbar 2024-01-08 01:17:47 +05:30
Debanjum Singh Solanky
d920e4d0a7 Make the docs overview page as the main docs landing page
- Make the docs overview page available at docs.khoj.dev root instead of
under docs.khoj.dev/docs path
  - Remove the new landing page, it is unnecessary.
- Remove /docs path prefix from links to internal doc pages
- Remove .md path suffix in internal doc pages for consistency
2024-01-08 01:13:42 +05:30
Debanjum Singh Solanky
80d1ad5b6f Fix image urls on docs overview page. Remove logo header in client docs 2024-01-08 00:30:31 +05:30
sabaimran
ce53bc52c5 Modify permissions of the GITHUB_TOKEN for publishing to gh-pages 2024-01-07 20:53:57 +05:30
sabaimran
740453fa18 Use documentation folder for building project and uploading data 2024-01-07 20:50:15 +05:30
sabaimran
2be7c84203 Enter documentation repository before running yarn build 2024-01-07 20:46:21 +05:30
sabaimran
ad95e88838 Update node version in github action 2024-01-07 20:41:24 +05:30
sabaimran
bd9aa578f4 Add a yarn.lock file and use for node.js setup 2024-01-07 20:36:02 +05:30
sabaimran
9b991eb4fe Migrate to using docusaurus, rather than docsify for documentation (#603)
* Add docusaurus documentation (to replace the docsify setup
* Remove older docs
* Specify documentation as the gh pages build action working directory
2024-01-07 20:28:15 +05:30
Debanjum Singh Solanky
98081bc0d3 Update Uninstall Documentation for Khoj Server when Self Hosting 2024-01-06 01:37:29 +05:30
Debanjum Singh Solanky
5d52dc5b35 Fix spelling in the development documentation for Khoj 2024-01-04 19:24:58 +05:30
Debanjum Singh Solanky
b6d5392c0c Release Khoj version 1.2.1 2024-01-04 18:45:37 +05:30
Debanjum Singh Solanky
fca7a5ff32 Push 1000 files at a time from the Obsidian client for indexing
FastAPI API endpoints only support uploading 1000 files at a time.
So split all files to index into groups of 1000 for upload to
index/update API endpoint
2024-01-04 18:43:22 +05:30
Debanjum Singh Solanky
4ded32cc64 Test 1000 file upload limit to index/update API endpoint
Due to FastAPI limitation
2024-01-03 22:14:36 +05:30
Debanjum Singh Solanky
4a234c8db3 Use default offline/openai chat model to extract DB search queries
Make usage of the first offline/openai chat model as the default LLM
to use for background tasks more explicit

The idea is to use the default/first chat model for all background
activities, like user message to extract search queries to perform.
This is controlled by the server admin.

The chat model set by the user is used for user-facing functions like
generating chat responses
2024-01-03 14:04:49 +05:30
Debanjum Singh Solanky
e28adf2884 Also index pdf, markdown and plaintext files using khoj emacs client
Previously you could only index org-mode files and directories from
khoj.el

Mark the `khoj-org-directories', `khoj-org-files' variables for
deprecation, since `khoj-index-directories', `khoj-index-files'
replace them as more appropriate names for the more general case

Resolves #597
2024-01-03 11:46:17 +05:30
Debanjum Singh Solanky
5abaed9d08 Use user chosen OpenAI model to extract DB search questions from query
Previously Khoj was selecting the first OpenAI model configured on
server and not the OpenAI model configured by the user for themselves
2024-01-03 11:45:06 +05:30
Debanjum Singh Solanky
e582639efa Move contributing section back down in sidebar of documentation website 2024-01-03 11:40:14 +05:30
Debanjum Singh Solanky
05536aab6b Merge how users can share personal information in personality prompt 2024-01-03 11:40:14 +05:30
Liam Swayne
455f78b178 Replace var declarations with let declarations (#576)
* Replace var declaration with let declaration
2023-12-29 10:20:48 +05:30
sabaimran
79913d4c17 Add isort to the pre-commit configuration and apply it to the whole project (#595)
* Apply isort to the entire repository
* Fix missing import issues in text_to_entries
* Fix imports in migration files
2023-12-28 18:04:02 +05:30
sabaimran
738f050086 Merge pull request #587 from khoj-ai/features/search-model-options-custom
Support multiple search models, with ability for custom user config
2023-12-28 13:09:49 +05:30
sabaimran
442c913de3 Update telemetry state for search model only if one is found, fix alt text for language setting 2023-12-28 12:53:53 +05:30
sabaimran
d3ab3f1b70 Rename matrix_blog to web and move the language setting into the content section 2023-12-28 12:44:49 +05:30
sabaimran
6946e038c2 Merge pull request #596 from khoj-ai/chore/add-developer-documentation
Improve the developer documentation
2023-12-23 18:43:43 +05:30
sabaimran
00af6baeb6 Resolve merge conflicts with intro message in chat.html web view 2023-12-23 17:52:58 +05:30
sabaimran
c10602b6c5 Put contributing higher in the sidebar 2023-12-23 14:04:53 +05:30
sabaimran
fe415e1508 Add tip for using the good-first-issue tag in GH issues 2023-12-23 14:04:05 +05:30
sabaimran
3280715ca0 Update contributor guidelines
- Add more accurate steps for building Khoj locally
- Remove outdated instructions
- Add specific steps to create a Github Issue
- Add instructions for Obsidian plugin development
2023-12-23 14:00:52 +05:30
sabaimran
afec4394f9 Merge pull request #592 from ayushjha119/Fixed-Health-Check-to-Khoj-api
Fixed health check to khoj api
2023-12-23 13:04:50 +05:30
sabaimran
c50eb8a691 Fix mypy/pre-commit issues 2023-12-23 11:44:37 +05:30
Debanjum Singh Solanky
21c55b4c0d Release Khoj version 1.2.0 2023-12-22 21:43:47 +05:30
Debanjum Singh Solanky
e42111a8af Fix bump_version.sh to commit, clean-up after desktop app version bump 2023-12-22 21:42:03 +05:30
Debanjum Singh Solanky
6a8c1fe423 Sanitize rendering chat references in Web, Desktop and Obsidian clients
Use textContent instead of innerHTML to append references

Resolves #583
2023-12-22 18:11:49 +05:30
Debanjum
6879daccc6 Fix Chat Streaming on Obsidian, Docker Image Version and First-Run, Chat Error Messages in Clients (#589)
- Fix streaming chat response in Obsidian client
- Fix first-run, chat error message in obsidian, desktop and web clients
- Set Khoj app version to latest version in Docker images
- Tag Khoj Docker image built on release with the `latest` tag
   This align docker image release cadence with client, server releases
2023-12-22 04:13:01 -08:00
Debanjum Singh Solanky
074123b9b9 Merge cloud, local dockerize workflows
- Delete unused config directory
2023-12-22 17:11:52 +05:30
Debanjum Singh Solanky
d101297995 Use markdown formatted chat message in chat modal 2023-12-22 17:01:31 +05:30
Debanjum Singh Solanky
350fd89c8d Clear chat history html in Obsidian if getChatHistory works too 2023-12-22 17:01:31 +05:30
Debanjum Singh Solanky
8d1e988059 Update tagging of the docker image on release, push to master & PR
- Tag docker image with `tag_name' on release (i.e tag push)
- Else tag with 'pre' on push to master
- Else tag with 'dev' on push to PR branch

- Only tag the latest release with release tag
  Previously the latest commit on master was being tagged with the
  latest tag. This doesn't sync with the release cadence of the rest
  of Khoj
2023-12-22 17:01:31 +05:30
Debanjum Singh Solanky
b5ae64cb3c Dynamically set Khoj app version in the Dockerization Github workflows 2023-12-22 17:01:31 +05:30
Debanjum Singh Solanky
d3d47dce0b Allow setting Khoj app version during docker build via build-args
This will allow troubleshooting by getting the actual khoj version
being used. Previously it was always set to a static 0.0.0 version

Command to build Khoj docker image with dynamically set current app version:
`docker-compose build server --build-arg VERSION=$(pipx run hatch version)'
2023-12-22 16:47:13 +05:30
ayushjha119
e487ec5370 fixed app to api health Check 2023-12-21 17:51:30 +05:30
Debanjum Singh Solanky
70607cbbbb Update FRE message to get any Khoj client to sync files with server 2023-12-21 15:23:47 +05:30
ayushjha119
b3d7d6a79d used the Response class from fastapi.responses and set the input for status_code to 200 2023-12-21 14:26:40 +05:30
sabaimran
e1aaff2053 Add more details about functionality in Khoj's intro message 2023-12-21 10:09:30 +05:30
sabaimran
a1211f40d7 Fix type declaration for the cross_encoder_model state variable. Update name of the new update API 2023-12-21 09:15:13 +05:30
sabaimran
089e4bee12 FIx unit tests with new search model configurations 2023-12-20 21:50:44 +05:30
Debanjum Singh Solanky
447c1b90e7 Fix streaming chat response in Obsidian client
- Convert renderIncrementalMessage to an async method as
  MarkdownRenderer is an async method

- Simplify code, remove unneeded JSON check
2023-12-20 14:51:19 +05:30
sabaimran
aa23da60a3 Add a notification banner to show temporary messages 2023-12-20 14:22:08 +05:30
Debanjum Singh Solanky
e04fe921eb Fix first-run, chat error message in obsidian, desktop and web clients
- Disable chat input field if getChatHistory had error as Khoj may not
  be setup correctly to chat
2023-12-20 14:03:07 +05:30
sabaimran
5ff9df9d4c Add support per user for configuring the preferred search model from the config page
- Honor this setting across the relevant places where embeddings are used
- Convert the VectorField object to have None for dimensions in order to make the search model easily configurable
2023-12-20 13:25:43 +05:30
sabaimran
0f6e4ff683 Add a model that specifies the user's search model configuration
- Update all endpoints that generate embeddings to use the new model. Incl. generating text embeddings, creating embeddings for a search query
2023-12-20 09:22:26 +05:30
sabaimran
6dd2b05bf5 Rebase with master 2023-12-19 21:02:49 +05:30
sabaimran
e3557cd8b7 Update the personality prompt to make Khoj aware that users can share data via the desktop app 2023-12-19 16:42:45 +05:30
sabaimran
927e477f68 Ignore typing error in custom action short description 2023-12-19 16:10:58 +05:30
sabaimran
946305d977 Add function to export conversations for debugging 2023-12-19 16:05:20 +05:30
sabaimran
903a01745f Use 0px for padding for input row buttons in web 2023-12-18 16:09:06 +05:30
sabaimran
1e14a24f06 Merge pull request #586 from khoj-ai/features/misc-image-and-online-improvements
Improvements to chat functionality and image generation
2023-12-17 23:28:08 +05:30
sabaimran
5b092d59f4 Ignore dict assignment typing error 2023-12-17 22:34:54 +05:30
sabaimran
03cb86ee46 Update typing and object assignment for new text to image method return 2023-12-17 21:28:33 +05:30
sabaimran
0288804f2e Render the inferred query along with the image that Khoj returns 2023-12-17 21:02:55 +05:30
sabaimran
49af2148fe Miscellaneous improvements to image generation
- Improve the prompt before sending it for image generation
- Update the help message to include online, image functionality
- Improve styling for the voice, trash buttons
2023-12-17 20:25:35 +05:30
sabaimran
7cb64cb2f9 Add telemetry for image generation conversation command 2023-12-17 18:25:03 +05:30
sabaimran
e9ea0195b0 Merge pull request #585 from khoj-ai/fix/image-generation-and-csrf-cookie
Fix image generation setup bug and CSRF cookie for admin login
2023-12-17 16:55:45 +05:30
sabaimran
09544dee09 Add TextToImageModelConfig to the admin page 2023-12-17 16:44:19 +05:30
sabaimran
0459666beb CSRF Cookie not set error in prod. Try fixing https forwarding for mitigation 2023-12-17 12:55:18 +05:30
sabaimran
61dde8ed89 If text to image config isn't set, send back an error message to the client 2023-12-17 12:54:50 +05:30
sabaimran
fefaa2271d Merge pull request #584 from khoj-ai/features/enforce-usage-limits-conversation-type
Add a ConversationCommand rate limiter for the chat endpoint
2023-12-17 11:20:35 +05:30
sabaimran
3065cea562 Address mypy typing issues 2023-12-16 09:24:26 +05:30
sabaimran
5f6dcf9f2e Add a rate limiter for the transcribe API endpoint 2023-12-16 09:18:56 +05:30
sabaimran
73a107690d Add a ConversationCommand rate limiter for the chat endpoint 2023-12-16 09:03:52 +05:30
sabaimran
9b961ed496 Merge pull request #580 from khoj-ai/fix-upgrade-chat-to-create-images
Support Image Generation with Khoj
2023-12-07 21:17:58 +05:30
Debanjum Singh Solanky
7504669f2b Fix rendering image on chat response in obsidian client 2023-12-05 03:48:07 -05:00
Debanjum Singh Solanky
408b7413e9 Use global openai client for transcribe, image 2023-12-05 03:36:33 -05:00
Debanjum Singh Solanky
162b219f2b Throw unsupported error when server not configured for image, speech-to-text 2023-12-05 01:51:14 -05:00
Debanjum Singh Solanky
8f2f053968 Fix rendering image on chat response in web, desktop client 2023-12-05 01:51:14 -05:00
Debanjum Singh Solanky
d124266923 Reduce promise based nesting in chat JS func used in desktop, web client
Use async/await to reduce .then() based nesting to improve code
readability
2023-12-05 01:51:14 -05:00
Debanjum Singh Solanky
6e3f66c0f1 Use base64 encoded image instead of source URL for persistence
The source URL returned by OpenAI would expire soon. This would make
the chat sessions contain non-accessible images/messages if using
OpenaI image URL

Get base64 encoded image from OpenAI and store directly in
conversation logs. This resolves the image link expiring issue
2023-12-05 01:51:14 -05:00
Debanjum Singh Solanky
52c5f4170a Show generated images in the chat modal of the Khoj Obsidian plugin 2023-12-05 01:51:14 -05:00
Debanjum Singh Solanky
8016a57b5e Show generated images in chat interface on Desktop client 2023-12-05 01:51:14 -05:00
Debanjum Singh Solanky
cc051ceb4b Show generated images in chat interface on Web client 2023-12-05 01:51:14 -05:00
Debanjum Singh Solanky
252b35b2f0 Support /image slash command to generate images using the chat API 2023-12-05 01:51:14 -05:00
sabaimran
ef21d78c99 Initial changes to support multiple search model configurations
- All search models are loaded into memory, and stored in a dictionary indexed by name
- Still need to add database migrations and create a UI for user to select their choice. Presently, it uses the default option
2023-12-05 00:35:40 -05:00
Debanjum Singh Solanky
1d9c1333f2 Configure text to image models available on server
- Currently supports OpenAI text to image model, by default dall-e-3
- Allow setting the text to image model via CLI during server setup
2023-12-04 21:27:53 -05:00
Debanjum Singh Solanky
f0222f6d08 Make save_to_conversation_log helper function reusable
- Move it out to conversation.utils from generate_chat_response function
- Log new optional intent_type argument to capture type of response
  expected. This can be type responses by Khoj e.g speech, image. It
  can be used to render responses by Khoj appropriately on clients
- Make user_message_time argument optional, set the time to now by
  default if not passed by calling function
2023-12-04 19:42:12 -05:00
sabaimran
d2ddbef08f Use a unique name for the temp PDF generated 2023-12-04 19:27:00 -05:00
sabaimran
d20746613a Properly filter out empty PDFs for indexing 2023-12-04 16:15:17 -05:00
Debanjum Singh Solanky
316b7d471a Handle offline chat model retrieval when no internet
Offline chat shouldn't fail on retrieve_model when no internet,
if model was previously downloaded and usable offline
2023-12-04 13:46:25 -05:00
Debanjum Singh Solanky
2b09caa237 Make online results an optional argument to the gpt converse method 2023-12-04 12:15:29 -05:00
Debanjum Singh Solanky
7009793170 Migrate to OpenAI Python library >= 1.0 2023-12-03 18:16:00 -05:00
sabaimran
62a89f79b7 Merge pull request #577 from khoj-ai/fix/user-subscription-email-not-exists
Fix null exception when user does not exist for subscription
2023-12-03 15:14:31 -08:00
sabaimran
cc064ea57d Fix circular import issue 2023-12-03 17:46:44 -05:00
sabaimran
21f8d63e89 If a user subscribes to Khoj with an email address that's not present in the DB, create an account 2023-12-03 17:28:40 -05:00
sabaimran
c5d297a9ed Recursively search through folders for indexing 2023-12-03 16:17:28 -05:00
Debanjum Singh Solanky
a57d529f39 Fix path to system tray icon of Khoj desktop app 2023-12-03 00:12:50 -08:00
Debanjum Singh Solanky
106cdbe455 Release Khoj version 1.1.0 2023-11-30 20:09:08 -08:00
Debanjum Singh Solanky
10ce4ee11c Ignore null params type check for markdown renderer in Obsidian client 2023-11-30 20:09:08 -08:00
Debanjum
02f40785aa Merge Github workflows to dockerize for production (#575) 2023-11-30 18:49:16 -08:00
sabaimran
a5ffa2342f Add documentation for local setup and fix admin panel bugs
- Wasn't able to login to the admin panel when KHOJ_DEBUG was not True. Fix this error so self-hosted users can get unblocked from accessing the admin settings
- Don't force users to set their KHOJ_DJANGO_SECRET_KEY
2023-11-30 17:55:27 -08:00
Debanjum Singh Solanky
9d4bfdf47c Merge Github workflows to dockerize for production 2023-11-30 17:18:13 -08:00
Debanjum Singh Solanky
d587632700 Clear result before render thinking placeholder emoji in Obsidian chat 2023-11-30 13:53:09 -08:00
Debanjum
a0686428ff Render Chat Responses as Markdown in Desktop, Obsidian Client (#571)
- Show temporary status message when copied to clipboard
- Render chat responses as markdown in Desktop client
- Render chat responses as markdown in chat modal of Obsidian client
- Render references of new responses in chat modal on Obsidian client. Use new style for references
- Properly stop `mediaRecorder` stream to clear microphone in-use state
- Render newlines when references expanded in Web, Desktop and Obsidian clients
2023-11-30 13:52:02 -08:00
Debanjum Singh Solanky
48719ee0dd Render newline separation in chat references to improve readability 2023-11-30 13:16:48 -08:00
Debanjum Singh Solanky
1a31a2efcf Render Khoj chat streaming response as md & show refs in Obsidian
- Use new style references for Khoj chat modal in Obsidian
- Khoj Chat responses in Obsidian had regressed to not show references
  for new questions after modal has been opened. Now even those are
  rendered, and use new references style
- Render chat response as markdown while it's being streamed
2023-11-30 13:02:00 -08:00
Debanjum Singh Solanky
0430fa67b6 Show temporary status message when copied to clipboard 2023-11-29 13:49:33 -08:00
Debanjum Singh Solanky
491a1a949a Render chat responses as markdown in Desktop client too 2023-11-29 13:49:33 -08:00
Debanjum Singh Solanky
20ef5bfc93 Properly stop mediaRecorder stream to clear microphone in-use state 2023-11-29 13:48:35 -08:00
Debanjum Singh Solanky
8faa63c3c6 Convert config page buttons to use stronger yellow 2023-11-28 19:55:43 -08:00
Debanjum Singh Solanky
de5aa5c32e Update pillow, aiohttp dependencies 2023-11-28 19:55:43 -08:00
sabaimran
fab57cc395 Fix pgvector installation instructions for Windows, Source 2023-11-28 14:46:09 -08:00
sabaimran
c4dcb51c91 Update headings for installation steps to indicate that local and docker setup are exclusive 2023-11-28 14:38:04 -08:00
Debanjum Singh Solanky
a6ca2076d5 Open link to Khoj app landing page from nav pane in current tab 2023-11-28 14:20:37 -08:00
Debanjum Singh Solanky
643e018947 Handle if user subscription field doesn't exists in telemetry func
Avoid null ref in the method when running Khoj server in anon mode
2023-11-28 14:15:14 -08:00
Debanjum Singh Solanky
110d7646fc Use milder yellow as primary Khoj theme color for chat, buttons etc. 2023-11-28 14:15:14 -08:00
sabaimran
18254850ab Set a default value for the khoj django secret key and add additional guidance for setting environment variables on first run 2023-11-28 09:39:44 -08:00
sabaimran
24b5aaef0a Merge pull request #569 from khoj-ai/features/enforce-subscription-status
Enforce subscription state on the chat API access
2023-11-27 16:12:26 -08:00
sabaimran
6290b463f5 Compute size of the indexed data only if explicitly requested to avoid heavy load on the DB 2023-11-27 12:05:00 -08:00
sabaimran
eb5e3096e0 Change subscribed scope to premium 2023-11-27 11:39:20 -08:00
sabaimran
6e1ba11e59 Resolve merge conflicts for rendering chat response 2023-11-27 11:33:13 -08:00
sabaimran
239b31bc85 Clarify some of the langauge in the chat configuration docs 2023-11-27 10:44:05 -08:00
sabaimran
309ba7234c Add instructions for setting up chat settings when locally hosting Khoj 2023-11-27 10:41:29 -08:00
sabaimran
5d8dbbdba4 Update instructions for Windows setup and add prerequisites for Docker 2023-11-27 10:32:02 -08:00
Debanjum Singh Solanky
71f2d54258 Render chat response as markdown while streaming on Web, Desktop clients 2023-11-26 20:27:10 -08:00
Debanjum Singh Solanky
9e714d032b Fix Khoj telemetry server. Add server_version column 2023-11-26 15:05:43 -08:00
Debanjum
ebeae543ee Speak to Khoj via Desktop, Web or Obsidian Client (#566)
- Create speech to text API endpoint
- Use OpenAI Whisper for ASR offline (by downloading Whisper model) or online (via OpenAI API)
- Add speech to text model configuration to Database
- Speak to Khoj from the Web, Desktop or Obsidian client
2023-11-26 14:32:11 -08:00
Debanjum Singh Solanky
b249bbb5b5 Limit max audio file size allowed for transcription on API endpoint 2023-11-26 14:19:46 -08:00
sabaimran
e438853b09 Add additional unit tests to verify behavior of unsubscribed/subscribed users 2023-11-26 13:09:00 -08:00
sabaimran
c18d52d1af Add contributors to the README 2023-11-26 12:05:36 -08:00
Debanjum Singh Solanky
a79604b601 Fix return types of offline, online transcribe methods for python 3.9 2023-11-26 06:26:34 -08:00
Debanjum Singh Solanky
06f99ceb3c Rename /api/speak API endpoint to /api/transcribe 2023-11-26 06:18:44 -08:00
Debanjum Singh Solanky
56a1a61c77 Remove unused button element retrieval code from web, desktop 2023-11-26 06:17:56 -08:00
Debanjum Singh Solanky
877532a167 Speak to Khoj from the Obsidian client
- Add transcription button with mic icon
- Collect audio recording on pressing mic
- Process and send audio recording to server for transcription
- Extract the functionality to flash status in chat input for reuse
2023-11-26 06:17:54 -08:00
Debanjum Singh Solanky
cc9eae5d18 Update default chat model to Mistral in GPT4AllProcessor config 2023-11-26 05:55:43 -08:00
Debanjum Singh Solanky
4636390f7f Transcribe speech to text offline with Whisper
- Allow server admin to configure offline speech to text model during
  initialization
- Use offline speech to text model to transcribe audio from clients
- Set offline whisper as default speech to text model as no setup api key reqd
2023-11-26 05:55:11 -08:00
Debanjum Singh Solanky
a0a7ab7ec8 Rename conversation.gpt4all package to conversation.offline 2023-11-26 04:19:32 -08:00
Debanjum Singh Solanky
499adf86a0 Move transcription using OpenAI API into independent package 2023-11-26 04:19:32 -08:00
Debanjum Singh Solanky
897170ab15 Use single db migration script for transcribe model, related updates 2023-11-26 04:19:32 -08:00
Debanjum Singh Solanky
28090216f6 Show transcription error status in chatInput placeholder on web, desktop
- Extract flashing status message in chat input placeholder into
  reusable function
- Use emoji prefixes for status messages
- Improve alt text of transcribe button to indicate what the button does
2023-11-26 04:19:32 -08:00
Debanjum Singh Solanky
fc040825b2 Default to Offline chat with Mistral as minimal setup, no API key reqd. 2023-11-26 01:07:20 -08:00
Debanjum Singh Solanky
5a6547677c Add type of operation variable in latest migration 2023-11-26 00:38:52 -08:00
Debanjum Singh Solanky
3e252036c3 Remove whitespace: pre-line from chat html, since markdown rendering 2023-11-26 00:27:29 -08:00
Debanjum Singh Solanky
b484795b8e Merge branch 'master' into add-speak-to-chat
- Conflicts:
  - src/interface/desktop/chat.html
    Combine and use common class names for speak component
  - src/khoj/database/adapters/__init__.py
    Combine imports
  - src/khoj/interface/web/chat.html
    Combine and use common class names for speak component
  - src/khoj/routers/api.py
    Combine imports
2023-11-26 00:26:21 -08:00
sabaimran
6233a957b4 Merge branch 'master' of github.com:khoj-ai/khoj into features/enforce-subscription-status 2023-11-25 22:46:10 -08:00
sabaimran
52b88de7f4 Indicate in the desktop if the user gets rate limited for indexing 2023-11-25 22:31:23 -08:00
Debanjum
e0a59cff68 Delete Conversation History from Web, Desktop, Obsidian Clients (#551)
Add delete button to clear conversation history from Web, Desktop and Obsidian Khoj clients

Resolves #523
2023-11-25 22:24:12 -08:00
Debanjum Singh Solanky
d0e294d8a5 Clear Conversation History from the Obsidian client
- Fix font color for Khoj chat responses in Obsidian. Previous color
  had too low a contrast to be readable
2023-11-25 22:16:13 -08:00
sabaimran
73e38fccf3 Explicitly set billing to off in the test for being able to index a large set of data 2023-11-25 20:48:32 -08:00
sabaimran
b2afbaa315 Add support for rate limiting the amount of data indexed
- Add a dependency on the indexer API endpoint that rounds up the amount of data indexed and uses that to determine whether the next set of data should be processed
- Delete any files that are being removed for adminstering the calculation
- Show current amount of data indexed in the config page
2023-11-25 20:28:04 -08:00
Debanjum Singh Solanky
07bf365c7c Clear any network connections to khoj server via khoj.el on reindex
- Ignore errors in deleting network requests to khoj server
- Also delete open network connection to khoj server on auto reindex
  Otherwise when server is unreachable a bunch of failed network
  connections accrue in the processes list
2023-11-25 20:19:41 -08:00
sabaimran
dd1badae81 Use userwithtoken.user when authenticating with an API key 2023-11-24 22:18:45 -08:00
sabaimran
48b9116195 Fix to use user rather than user_with_token in authenticated credentials 2023-11-24 22:18:00 -08:00
sabaimran
771f9bcfa1 If the user subscription was created over 7 days ago, then their trial is expired 2023-11-24 22:08:32 -08:00
sabaimran
e5b1350523 Enforce API use limits depending on whether the server has billing enabled
and whether the given user is subscribed
2023-11-24 21:55:16 -08:00
sabaimran
9c868ee10b Use the state.billing_enabled field to determine whether to use the subscribed scope 2023-11-24 20:41:19 -08:00
sabaimran
69c8f45830 Use scopes to represent whether the use has a valid subscription in the middleware 2023-11-24 20:29:36 -08:00
Debanjum
25f3f2367e Handle Server Unavailable Error from Khoj.el (#568)
- Make auto-update of content index user configurable from khoj.el
- Handle server unavailable error on auto-index schedule job in khoj.el

Resolves #567
2023-11-24 16:46:07 -08:00
Debanjum Singh Solanky
138f4e3f3c Make auto-update of content index user configurable from khoj.el 2023-11-24 16:40:50 -08:00
Debanjum Singh Solanky
0885fc6c23 Handle server unavailable error on auto-index schedule job in khoj.el 2023-11-24 16:39:44 -08:00
sabaimran
c13953311a Add reflective questions to admin pages 2023-11-23 14:01:05 -08:00
sabaimran
c42ec32a95 Merge pull request #552 from khoj-ai/features/internet-enabled-search
Support internet-enabled, online searching using Serper.dev
2023-11-23 12:34:05 -08:00
sabaimran
e3b32e412c Merge pull request #556 from khoj-ai/features/reflective-suggested-questions
Add support for suggesting base questions to users
2023-11-23 11:57:02 -08:00
sabaimran
5fac39afed Fix PYTHONPATH reference in order to maintain appropriate package imports 2023-11-22 20:35:11 -08:00
sabaimran
c641b8df58 Update desktop package version 2023-11-22 17:54:53 -08:00
sabaimran
a1b2289074 Release Khoj version 1.0.1 2023-11-22 17:52:07 -08:00
sabaimran
e34db979b6 Add instructions for using the self hosted URL in clients 2023-11-22 17:32:43 -08:00
sabaimran
b1b037f0ea Fix URL configuration issues with reorganized subfolders 2023-11-22 17:03:33 -08:00
sabaimran
e0949e232b Import random in adapters file for selecting reflective question 2023-11-22 07:52:51 -08:00
sabaimran
256e8de40a Merge with features/internet-enabled-search 2023-11-22 07:25:24 -08:00
Debanjum Singh Solanky
fd60db766e Clear Conversation History from the Web Client 2023-11-22 03:35:00 -08:00
Debanjum Singh Solanky
d5a4830761 Clear Conversation History from the Desktop Client 2023-11-22 03:35:00 -08:00
Debanjum Singh Solanky
3096544cf2 Create API endpoint to clear user's chat history 2023-11-22 03:34:59 -08:00
Debanjum Singh Solanky
63675b3299 Speak to Khoj from the Desktop client
- Use icons to style speech to text recording state
2023-11-22 02:47:17 -08:00
Debanjum Singh Solanky
2951fc92d7 Speak to Khoj from the Web client
- Use icons to style speech to text recording state
2023-11-22 02:47:17 -08:00
Debanjum Singh Solanky
cc77bc4076 Create speech to text API endpoint. Use OpenAI whisper for ASR
- Wrap audio transcription in try/catch and delete audio file after
processing
- Use configured speech to text model, else handle error
2023-11-22 02:47:06 -08:00
Debanjum Singh Solanky
1ca99b6eb0 Add speech to text model configuration to Database 2023-11-22 02:24:31 -08:00
sabaimran
60c23d9e3a Add online search chat director tests 2023-11-21 23:08:36 -08:00
sabaimran
c652a7fd2d Move text_to_entries under the new content folder 2023-11-21 22:25:17 -08:00
sabaimran
1e2af083f0 Rename the data_sources module to content 2023-11-21 22:11:32 -08:00
sabaimran
4cb28aeffb Resolve merge conflicts with master 2023-11-21 22:07:41 -08:00
Debanjum Singh Solanky
4cdfe8fc4f Re-enable Khoj Obsidian plugin for Mobile, as Khoj cloud is available 2023-11-21 16:33:48 -08:00
Debanjum
5d9d50157e Clean Logs, Improve Message Rendering and Make Khoj Trusted Host Configurable (#561)
- Append chat message to chat logs as TextNodes in web, desktop clients

- Simplify Code to Identify Files from Github, Notion on Web, Desktop Client
  - Use file source to find entries from github, notion on web, desktop client
  - Pass file source to clients via text search API response

- Make Django Logs Follow Khoj Log Format, Verbosity
  - Handle image search setup related warning
  - Format Django initializing outputs using Khoj logger format

- Use `KHOJ_HOST` env var to set allowed/trusted domains to host Khoj
2023-11-21 15:14:34 -08:00
sabaimran
458e794d00 Revert PYTHONPATH to what it was before 2023-11-21 14:40:57 -08:00
Debanjum Singh Solanky
9e736d4340 Use KHOJ_DOMAIN for CORS allow_origins list as well
- Default to app.khoj.dev
- Remove unnecesary any_path regex in allow_origins. It only cares
  about host, paths are not set in origin header
2023-11-21 14:02:04 -08:00
sabaimran
5469e81a87 Use full path for the static directory in FastAPI and reflect deeper nesting of the django app 2023-11-21 13:44:45 -08:00
sabaimran
d199c4c35f Resovle merge conflicts with matser 2023-11-21 13:35:56 -08:00
Debanjum Singh Solanky
76d041f633 Use KHOJ_HOST env var to set allowed/trusted domains to host Khoj
Allows hosting Khoj behind other, non "khoj.dev" domains
2023-11-21 13:11:45 -08:00
Debanjum Singh Solanky
90d463c12a Append chat message to chat logs as TextNodes in web, desktop clients 2023-11-21 13:10:50 -08:00
Debanjum Singh Solanky
befcbcdd5d Use file source to find entries from github, notion on web, desktop client
This is a more robust mechanism of identification than via file name
including github or notion domain names
2023-11-21 13:10:50 -08:00
Debanjum Singh Solanky
3f0de45ec6 Pass file source to clients via text search API response
Source of entry stored in DB is now passed to clients for processing
2023-11-21 13:10:50 -08:00
Debanjum Singh Solanky
4aec581306 Handle image search setup related warning
Ideally should rename model_directory to config_directory or some such
but the current image search code will need to be migrated soon. So
changing the variable name and creating a migration script for old
khoj.yml files using model-directory variable isn't worth it

Remove the explicity set of number of threads to use by pytorch. Use
the default used by it.
2023-11-21 13:10:50 -08:00
Debanjum Singh Solanky
b06628ee31 Format Django initializing outputs using Khoj logger format
- Collect STDOUT from the `migrate', `collectstatic' commands and
  output using the Khoj logger format and verbosity settings

- Only show Django `collectstatic' command output in verbose mode

- Fix showing the Initializing Khoj log line by moving it after logger
  level set
2023-11-21 13:10:50 -08:00
Debanjum Singh Solanky
6d9091bef5 Disable isort for now 2023-11-21 13:03:18 -08:00
sabaimran
341abf03ff Handle none for search_type and use equals comparator rather than in for determining Notion type 2023-11-21 12:55:09 -08:00
Debanjum Singh Solanky
19e042037a Run isort with black profile to avoid conflicts between the two 2023-11-21 12:52:07 -08:00
sabaimran
2bb989e9d8 Resolve merge conflicts and fix some import ordering 2023-11-21 12:30:43 -08:00
sabaimran
244b76ffed Add isort for automatic import sorting and skip main.py because it's a drama queen 👑 2023-11-21 12:20:41 -08:00
Debanjum
8a0d92e2d7 Fix Connectivity Check in Obsidian Client (#559) from dtkav/bugfix-local-connectivity-check
Check connection to Khoj server for self-hosted server. This check had regressed during the cloud rearchitecture
2023-11-21 12:05:16 -08:00
sabaimran
0e6f09b241 Merge pull request #562 from khoj-ai/fix/pypi-package-app-not-included
Fix PyPi package app reference issue
2023-11-21 11:54:46 -08:00
sabaimran
61f6b8c0d4 Ignore-check step failed due to unrecognized code. Try using capital letters for indicator 2023-11-21 11:33:43 -08:00
sabaimran
38144a7a69 pull_request path should be src/khoj rather than src/ 2023-11-21 11:33:07 -08:00
Debanjum
e5130fb3f3 Fix ranking search results on Obsidian (#560)
This bug was causing the search results on the Obsidian client to be shown in the reverse order of their actual relevance.

It reversed since entry scores returned by Khoj server are a distance metric since the move to Postgres. So lesser distance is better. Previously higher score was better.
2023-11-21 11:32:47 -08:00
sabaimran
333cb3445c Use colon rather than equals to indicate typing 2023-11-21 11:28:51 -08:00
Debanjum Singh Solanky
645fd96634 Search across all content types from Khoj Obsidian client
Previously it was only searching for PDF and Markdown files. This was
meant to show only content from current vault as results.

But it has not scaled well as other clients also allow syncing PDF and
markdown files now. So remove this content type filter for now.

A proper solution would limit by using file/dir filters on server or
client side.
2023-11-21 11:19:33 -08:00
sabaimran
a1460a5bf9 Set operations to typed empty list in migration file 2023-11-21 11:14:40 -08:00
sabaimran
8932fc0c36 Ignore w004 check to bypass pypi warnings for check-wheel-contents
- PyPi doesn't like to have files that start with numbers, however all of the generated django migration files start with numbers. To accommodate, skip this check.
- Refer to https://pypi.org/project/check-wheel-contents/ for documentation and recommendation
2023-11-21 11:12:50 -08:00
sabaimran
71e794c26f Remove the sys.append line in the main.py file, as it's not required 2023-11-21 10:57:21 -08:00
sabaimran
a474c31e02 Move the django app into the src/khoj folder for better organization and functionality
- Our pypi package currently does not work because the django app and associated database is not included. To remedy this issue, move the app into the src/khoj folder. This has the added benefit of improved organization of the codebase, as all server related code is now in a single folder
- Update associated file paths and system references
2023-11-21 10:56:04 -08:00
Debanjum Singh Solanky
c89bd49973 Fix ranking search results on Obsidian
It's reversed since score of entries is now a distance metric on
Khoj server. So lesser distance is better. Previously higher score was
better
2023-11-21 01:24:59 -08:00
Debanjum
6d8e889917 Improve Self Hosted Khoj Setup (#557)
- c07401cf Fix, Improve chat config via CLI on first run by using defaults
- d61b0dd5 Add Khoj Django app package to sys path to load Django module via pip install
- 4e98acbc Update minimum pydantic version to one with model_validate function
2023-11-20 17:25:53 -08:00
Daniel Grossmann-Kavanagh
f142999bce fix khoj local server usage 2023-11-20 17:07:30 -08:00
Debanjum Singh Solanky
c07401cf76 Fix, Improve chat config via CLI on first run by using defaults
- Fix setting prompt size for online chat
- generally improve chat config via cli by using default chat model,
  prompt size for online and offline chat
2023-11-20 17:01:20 -08:00
sabaimran
b142de15a8 Merge branch 'features/internet-enabled-search' of github.com:khoj-ai/khoj into features/reflective-suggested-questions 2023-11-20 15:56:09 -08:00
sabaimran
a9623ef85a Add requisite imports in order to instantiate offline model in adapters file 2023-11-20 15:27:42 -08:00
sabaimran
a8f13f334f Fix merging issues with base after popping the stash 2023-11-20 15:22:50 -08:00
sabaimran
8fa0b69c67 Resolve merge issue with adapters methods 2023-11-20 15:21:06 -08:00
sabaimran
fee99779bf Add subqueries for internet-connected search results and update client-side code accordingly
- Add a wrapper method to help make direct queries to the LLM and determine any intermediate responses needed for handling the request
2023-11-20 15:19:15 -08:00
Debanjum Singh Solanky
d61b0dd55c Add Khoj Django app package to sys path to load Django module via pip install 2023-11-20 14:55:00 -08:00
Debanjum Singh Solanky
4e98acbca7 Update minimum pydantic version to one with model_validate function 2023-11-20 14:52:37 -08:00
sabaimran
b8e6883a81 Merge branch 'master' of github.com:khoj-ai/khoj into features/internet-enabled-search 2023-11-19 16:20:08 -08:00
sabaimran
237195e20e Make all name-related fields nullable within the GoogleUser 2023-11-19 14:22:32 -08:00
sabaimran
4def8cce36 Merge pull request #541 from asim-shrestha/patch-1
Add test separators
2023-11-19 14:14:34 -08:00
Debanjum
71799add0b Index Parent Headings of Org-Mode Entries to Improve Search Context (#548)
### Overview
The parent hierarchy of org-mode entries can store important context. 
This change updates OrgNode to track parent headings for each org entry and adds the parent outline for each entry to the index

### Details
- Test search uses ancestor headings as context for improved results
- Add ancestor headings of each org-mode entry to their compiled form
- Track ancestor headings for each org-mode entry in org-node parser

Resolves #85
2023-11-19 13:18:19 -08:00
sabaimran
e398a76779 Fix test word filter 2023-11-19 13:14:58 -08:00
sabaimran
33a9304428 Resolve merge conflicts 2023-11-19 12:57:55 -08:00
sabaimran
cfd76b8472 Add open graph links to configure Khoj Docs preview 2023-11-19 12:16:59 -08:00
sabaimran
ef5e9d66c1 Resolve merge conflicts in dependency imports 2023-11-19 11:42:20 -08:00
Debanjum Singh Solanky
c3465d6982 Release Khoj version 1.0.0 2023-11-19 09:50:25 -08:00
Debanjum
736744be3a Update documentation to reflect new multi-user config scenario (#550)
- Update docs to show how to use Khoj Cloud
- Move self-hosting Khoj to separate section
- Add page to setup Desktop app
- Set default URL to Khoj Cloud URL in Obsidian, Emacs clients
2023-11-18 18:22:46 -08:00
Debanjum Singh Solanky
d0e84385f2 Simplify links in Khoj docs to use page_name.md with no prefixes
This allows jumping to page via VSCode IDE and on docs website
2023-11-18 18:17:46 -08:00
Debanjum Singh Solanky
fc65d8a9fe Add documentation page for the Khoj Desktop client 2023-11-18 18:17:35 -08:00
Debanjum Singh Solanky
35b469e488 Simplify setup, features since Khoj cloud in docs
- No Khoj server setup required to start using Khoj from Obsidian, Emacs
- Use tabs for install, upgrade in Emacs with different package
  managers
- Use default subtitles in Khoj Docs
- Deduplicate query filters, remove backend setup instructions in
  plugin pages
- Remove stale Setup demo on Khoj Obsidian plugin docs
2023-11-18 17:25:52 -08:00
Debanjum Singh Solanky
e1bf1f0e86 Update default Khoj server URL to Khoj cloud on Emacs, Obsidian clients 2023-11-18 16:25:45 -08:00
Debanjum Singh Solanky
8775ce730a Use URL fragments to allow jumping to config page sections on Web app 2023-11-18 16:25:45 -08:00
sabaimran
a5613cb08a Merge pull request #554 from khoj-ai/fix/issues-with-prod-chat
Fix misc. issues with chat configuration
2023-11-18 14:45:06 -08:00
sabaimran
f792b1e301 Remove already defined identical function 2023-11-18 14:08:50 -08:00
sabaimran
e2fff5dc47 Don't explicitly use value to get the model type value 2023-11-18 14:01:01 -08:00
sabaimran
a8a25ceac2 Honor user's chat settings when running the extract questions phase
- Add marginally better error handling when GPT gives a messed up respones to the extract questions method
- Remove debug log lines
2023-11-18 13:31:51 -08:00
sabaimran
67156e6aec Add new logs for debugging issues with chat references 2023-11-18 12:10:50 -08:00
sabaimran
5de2ab6098 Change parse_obj calls to use model_validate per new pydantic specification 2023-11-18 12:10:36 -08:00
sabaimran
ebdb423d3e Merge pull request #553 from khoj-ai/features/validation-errors
Update types of base config models for pydantic 2.0
2023-11-18 00:42:56 -08:00
sabaimran
6d249645a6 Fix interpretation of the default search type 2023-11-18 00:04:18 -08:00
sabaimran
f180b2ba94 Resolve mypy errors for various data types 2023-11-17 23:26:15 -08:00
sabaimran
3328a41f08 Update types of base config models for pydantic 2.0 2023-11-17 23:08:52 -08:00
sabaimran
f688529150 Update the default configuration for the AppConfig 2023-11-17 19:26:31 -08:00
sabaimran
11ccb92755 Fix formatting of welcome message to use markdown 2023-11-17 18:55:59 -08:00
Debanjum Singh Solanky
ca87b4ede9 Wrap common API query parameters into shared class to deduplicate code
- Upgrade FastAPI to >= latest version. Required upgrade of FastAPI.
  Earlier version didn't support wrapping common query params in class

- Use per fixture app instead of a global FastAPI app in conftest

- Upgrade minimum required Django version

- Fix no notes chat director test with updated no notes message
  No notes message was updated in commit 118f1143
2023-11-17 18:43:49 -08:00
sabaimran
262f3ccb59 Resolve mypy issues with formatting 2023-11-17 17:11:00 -08:00
sabaimran
a7e00898cb Fix rendering even when no online context references are returned 2023-11-17 16:41:28 -08:00
sabaimran
0fcf234f07 Add support for using serper.dev for online queries
- Use the knowledgeGraph, answerBox, peopleAlsoAsk and organic responses of serper.dev to provide online context for queries made with the /online command
- Add it as an additional tool for doing Google searches
- Render the results appropriately in the chat web window
- Pass appropriate reference data down to the LLM
2023-11-17 16:19:11 -08:00
Debanjum Singh Solanky
33ad9b8e64 Update text search test since indexing ancestor hierarchy added 2023-11-17 15:26:55 -08:00
Debanjum Singh Solanky
55785d50c3 Use title, when present, as root ancestor of entries instead of file path 2023-11-17 15:03:27 -08:00
sabaimran
bfbe273ffd Add some styling to the copy button for programmatic output 2023-11-17 12:18:35 -08:00
sabaimran
9ddf3b58c3 Use the markdown parser for rendering the chat messages in the web interface 2023-11-17 12:14:02 -08:00
sabaimran
a0b12b001a Provide in-line rendering when output matches certain views 2023-11-17 11:04:36 -08:00
sabaimran
ec06d2c446 Move data indexer files into a separate folder under processor. Update assoc UTs 2023-11-16 17:19:55 -08:00
Debanjum Singh Solanky
68ac1e0193 Automate Desktop app builds on new release or push to master branch 2023-11-16 16:09:03 -08:00
sabaimran
45a42faec8 Make adjectives more positive for api token generation 2023-11-16 15:55:35 -08:00
sabaimran
3934633947 Update references to all documentation to reflect instructions for managed service
- By default assume the audience of this website is people looking to understand the featuer offering of Khoj, and then people who are looking to self-host
2023-11-16 15:26:03 -08:00
sabaimran
7688228b9c Update docs to reflect new setup processes and instructions based on rearchitecture
- Most important updates include the depedency requirement to setup Postgres when running/setting Khoj up locally
- Add instructiosn for Docker
- Shift to recommend desktop client and update instructions for how to configure Khoj for user
2023-11-16 12:56:42 -08:00
sabaimran
118f1143ff When user tries using the notes slash command without having any data indexed 2023-11-16 12:52:39 -08:00
sabaimran
e8a13f0813 Add multi-user support to Khoj and use Postgres for backend storage (#549)
- Adds support for multiple users to be connected to the same Khoj instance using their Google login credentials
- Moves storage solution from in-memory json data to a Postgres db. This stores all relevant information, including accounts, embeddings, chat history, server side chat configuration
- Adds the concept of a Khoj server admin for configuring instance-wide settings regarding search model, and chat configuration
- Miscellaneous updates and fixes to the UX, including chat references, colors, and an updated config page
- Adds billing to allow users to subscribe to the cloud service easily
- Adds a separate GitHub action for building the dockerized production (tag `prod`) and dev (tag `dev`) images, separate from the image used for local building. The production image uses `gunicorn` with multiple workers to run the server.
- Updates all clients (Obsidian, Emacs, Desktop) to follow the client/server architecture. The server no longer reads from the file system at all; it only accepts data via the indexer API. In line with that, removes the functionality to configure org, markdown, plaintext, or other file-specific settings in the server. Only leaves GitHub and Notion for server-side configuration.
- Changes license to GNU AGPLv3

Resolves #467 
Resolves #488 
Resolves #303 
Resolves #345 
Resolves #195 
Resolves #280 
Resolves #461 
Closes #259 
Resolves #351
Resolves #301
Resolves #296
2023-11-16 11:48:01 -08:00
sabaimran
1466aef554 Change license to GNU AGPLv3 from GNU GPLv3
- This enforces that upstream consumers of this code should open source their software for any network-distributed services
2023-11-16 11:14:06 -08:00
sabaimran
36d200580b Use a different name for the production-config containers 2023-11-16 10:28:28 -08:00
sabaimran
ba633c4015 Only build the production docker image when pushing to master 2023-11-16 09:24:57 -08:00
Debanjum Singh Solanky
ddb07def0d Test search uses ancestor headings as context for improved results
- Update test data to add deeper outline hierarchy for testing
  hierarchy as context
- Update collateral tests that need count of entries updated, deleted
  asserts to be updated
2023-11-16 03:05:19 -08:00
Debanjum Singh Solanky
74403e3536 Add ancestor headings of each org-mode entry to their compiled form
Resolves #85
2023-11-16 02:54:41 -08:00
Debanjum Singh Solanky
305c25ae1a Track ancestor headings for each org-mode entry in org-node parser 2023-11-16 02:39:14 -08:00
Debanjum
208ddddc6a Make Search Model Configurable on Server (#544)
- Make search model configurable on server
- Update migration script to get search model from `khoj.yml` to Postgres
- Update first run message on Khoj Desktop and Web app landing page
- Other miscellaneous bug fixes
2023-11-16 00:11:58 -08:00
Debanjum Singh Solanky
cc05013715 Update first run message on Web app with Chat models setup instructions
- Link to Django admin panel for user to create Chat Models on their
  Khoj server
- This should only get hit when user is not using Khoj cloud, as Khoj
  cloud would already have Chat models configured
2023-11-15 22:44:24 -08:00
Debanjum Singh Solanky
6c1693b8f4 Update first run message on Desktop app with API token setup instructions
- Open Web app settings in the default browser via link click
- Open Desktop app settings via link click
2023-11-15 22:44:11 -08:00
Debanjum Singh Solanky
922983bd53 Set max cos distance to 0.18. Test search API query with max distance 2023-11-15 20:26:21 -08:00
Debanjum Singh Solanky
18dbad5edb Use Sigmoid to normalize cross-encoder score between 0-1
- While sigmoid normalization isn't required for reranking.
  Normalizing score to distance metrics for both encoder and cross
  encoder scores is useful to reason about them
- Softmax wasn't required as don't need probabilities, sigmoid is good
  enough to get distance metric
2023-11-15 19:31:59 -08:00
sabaimran
0da4db4310 Merge pull request #547 from khoj-ai/features/fix-api-token-generator
Update the return type of the API token generator
2023-11-15 19:23:18 -08:00
sabaimran
ea144de438 Merge with master 2023-11-15 18:34:46 -08:00
sabaimran
6b17aeb32d Resolve merge conflicts in auth.py with remove KhojApiUser import 2023-11-15 17:32:53 -08:00
Debanjum Singh Solanky
348cc0cf0e Use better name for DB adapter func to create user by Google token 2023-11-15 17:31:50 -08:00
Debanjum Singh Solanky
08a057bdd5 Rename SearchModel to SearchModelConfig DB model, Require Cross-Encoder 2023-11-15 17:31:50 -08:00
Debanjum Singh Solanky
0679b2a7bd Use embeddings model store from state in text to entries
Do not need to instantiating it separately. In all other places we're
using the embeddings model store in global state anyway
2023-11-15 17:31:50 -08:00
sabaimran
f88a5867b4 Allow dockerize step to run for prod from PR temporarily 2023-11-15 17:31:50 -08:00
sabaimran
245a9cbf63 Fix return type of the update_or_create method 2023-11-15 17:31:50 -08:00
sabaimran
10be8dfad9 Rename dockerize dev action to be more accurate 2023-11-15 17:31:50 -08:00
sabaimran
70f5d0ed3c Add a dev workflow for GitHub actions, change the production workflow to only kick off when pushed to master 2023-11-15 17:31:50 -08:00
sabaimran
bbae7dd83c Update logic for creating a new user to use aupdate_or_create 2023-11-15 17:31:50 -08:00
sabaimran
154de8c629 Update format for return type of the generate token method 2023-11-15 17:31:12 -08:00
sabaimran
cf74fa4a70 Allow dockerize step to run for prod from PR temporarily 2023-11-15 17:04:48 -08:00
sabaimran
8e62af77b9 Update format for return type of the generate token mehtod 2023-11-15 17:03:01 -08:00
sabaimran
4a487aff23 Fix return type of the update_or_create method 2023-11-15 14:35:42 -08:00
sabaimran
992e54c218 Rename dockerize dev action to b emore accurate 2023-11-15 14:09:28 -08:00
sabaimran
99f5a6082e Add a dev workflow for GitHub actions, change the production workflow to only kick off when pushed to master 2023-11-15 14:07:25 -08:00
sabaimran
b63856ecb4 Update logic for creating a new user to use aupdate_or_create 2023-11-15 12:50:39 -08:00
sabaimran
b8e7488a95 Use a more permissive distance filter for search results from notes 2023-11-15 11:13:47 -08:00
sabaimran
d06b2cf24b Downgrade pyproject.toml to avert depedency conflict 2023-11-15 10:47:54 -08:00
sabaimran
05b7542115 Remove config lock from the state 2023-11-15 10:44:45 -08:00
sabaimran
ecd005cac0 Check if search model is already in DB before creating a new one 2023-11-15 10:41:35 -08:00
Debanjum Singh Solanky
9c6e7bdea2 Upgrade server, desktop app dependencies to resolve CVE bugs 2023-11-15 01:47:53 -08:00
Debanjum Singh Solanky
5a6ab9cc85 Fix failing client tests 2023-11-15 00:17:44 -08:00
Debanjum Singh Solanky
8f200cf53f Remove unused parameter from configure_search_type method 2023-11-14 19:09:35 -08:00
Debanjum Singh Solanky
f8e5e118e1 Only create KhojUser on login if doesn't already exist 2023-11-14 19:09:35 -08:00
Debanjum Singh Solanky
3d8d6145f2 Add search model config from khoj.yml to Postgres DB via migration script 2023-11-14 19:09:35 -08:00
Debanjum Singh Solanky
4af194d74b Make search model configurable on server
- Expose ability to modify search model via Django admin interface
- Previously the bi_encoder and cross_encoder models to use were set
  in code
- Now it's user configurable but with a default config generated by
  default
2023-11-14 19:09:35 -08:00
Debanjum
b734984d6d Fix, Improve Khoj with multi-user, db support for Khoj Cloud Release (#539)
### Overview
Prepare Khoj with multi-user, db support for Khoj Cloud release

### Details
- Add first run experience to configure Khoj via khoj CLI 
- Improve Web app settings page: Move files data into content section card. Move content index update button(s) to content section
- Improve OpenAI chat prompts
  - Push more general information for OpenAI models into system prompt
  - Make it more aware of it's current capabilities
  - Weaken asking follow-up questions
- Rate-limit calls to the chat API
- Add back search results quality threshold
  - Normalize quality score definitions across cross_encoder, encoder to distance metric
- Remove reference to deprecated button
- Await result of the search query
- Fixed Langchain issue by allowing the Docker image to rebuild with a later package version
2023-11-14 16:55:34 -08:00
Debanjum Singh Solanky
e98141f4c3 Subscribe default user to standard plan with a far away renewal date
Self hosted users in anonymous mode have all capabilities unlocked
2023-11-14 16:31:39 -08:00
Debanjum Singh Solanky
9d30fda26d Deduplicate, improve name of prompt templates for GPT4All chat models
- Do not pass unused rerank_results parameter to text_search.query method
2023-11-14 16:31:09 -08:00
Debanjum Singh Solanky
795ec9eb55 Add KHOJ_prefix to server admin credentials environment variables 2023-11-14 16:13:13 -08:00
sabaimran
ee005de662 Rename django files URL to server instead of django 2023-11-14 12:36:38 -08:00
sabaimran
75e5a6b6de Remove all the example mounted volumes as they're no longer required in the new architecture 2023-11-14 12:31:24 -08:00
sabaimran
20ce3d0c78 Update default docker compose configuration with Khoj local mode 2023-11-14 12:21:26 -08:00
sabaimran
8c36079f74 Add a first run experience to intialize the admin user if none exists and setup chat models 2023-11-13 21:07:12 -08:00
Debanjum Singh Solanky
e9adb58c16 Rate limit calls to the /chat API per user, per day/minute 2023-11-13 19:41:46 -08:00
Debanjum Singh Solanky
33a8eb0470 Log when new user is created 2023-11-13 19:37:24 -08:00
sabaimran
603f838115 Block input text field when waiting for chat response 2023-11-11 17:14:37 -08:00
Asim Shrestha
0bfc094e18 Add test separators 2023-11-11 17:08:58 -08:00
Debanjum Singh Solanky
9c321ac070 Fix cross encoder to use softmax to convert it to a distance metric 2023-11-11 16:12:24 -08:00
sabaimran
8a824167cf Merge branch 'fix/imports-and-references' of github.com:khoj-ai/khoj into fix/imports-and-references 2023-11-11 12:59:31 -08:00
sabaimran
fa428932a8 Update URL for downloading the desktop application 2023-11-11 12:59:15 -08:00
Debanjum Singh Solanky
941c7f23a3 Only get text search results above confidence threshold via API
- During the migration, the confidence score stopped being used. It
  was being passed down from API to some point and went unused

- Remove score thresholding for images as image search confidence
  score different from text search model distance score

- Default score threshold of 0.15 is experimentally determined by
  manually looking at search results vs distance for a few queries

- Use distance instead of confidence as metric for search result quality
  Previously we'd moved text search to a distance metric from a
  confidence score.

  Now convert even cross encoder, image search scores to distance metric
  for consistent results sorting
2023-11-11 04:11:33 -08:00
Debanjum Singh Solanky
e44e6df221 Reduce data dumped in console log from web, desktop app 2023-11-11 02:05:07 -08:00
Debanjum Singh Solanky
f044a89d50 Show status in Save, Reinitialize button of config page on web app
- Show non-transient error message in status element if action fails
- On success, just show temporary success message within button
2023-11-11 02:04:58 -08:00
Debanjum Singh Solanky
f17d9da36c Move Configure, Reinitialize buttons into the Content section on Web app
Remove the Results Count button from the web app. It's hanging weirdly
with not much context to its purpose.

Reintroduce it in the Search card when created under the Features section
2023-11-11 02:01:39 -08:00
Debanjum Singh Solanky
325cb0f7fb Show message in Save button of Github, Notion config save in web app
Show the success, failure message only temporarily. Previously it
stuck around after clicking save until page refresh
2023-11-11 02:01:39 -08:00
Debanjum Singh Solanky
b34d4fa741 Save config, update index on save of Github, Notion config in web app
Reduce user confusion by joining config update with index updation for
each content type.

So only a single click required to configure any content type instead
of two clicks on two separate pages
2023-11-11 00:33:49 -08:00
Debanjum Singh Solanky
c4364b9100 Weaken asking follow-up qs and q&a mode in notes prompt to OpenAI models
- Notes prompt doesn't need to be so tuned to question answering. User
could just want to talk about life. The notes need to be used to
response to those, not necessarily only retrieve answers from notes

- System and notes prompts were forcing asking follow-up questions a
  little too much. Reduce strength of follow-up question asking
2023-11-10 23:36:43 -08:00
Debanjum Singh Solanky
cba371678d Stop OpenAI chat from emitting reference notes directly in chat body
The Chat models sometime output reference notes directly in the chat
body in unformatted form, specifically as Notes:\n['. Prevent that.
Reference notes are shown in clean, formatted form anyway
2023-11-10 23:36:43 -08:00
Debanjum Singh Solanky
8585976f37 Revert "Use notes in system prompt, rather than in the user message"
This reverts commit e695b9ab8c.
2023-11-10 23:36:43 -08:00
Debanjum Singh Solanky
b6441683c6 Increase reference text on 1st expansion to 3 lines and 140 characters 2023-11-10 23:36:43 -08:00
sabaimran
55c97241b5 Merge branch 'fix/imports-and-references' of github.com:khoj-ai/khoj into fix/imports-and-references 2023-11-10 22:38:34 -08:00
sabaimran
e2e96f9aa4 Add default settings to let new users be subscribed on trial
- Add the default user to a subscription trial
- Update associated unit tests
2023-11-10 22:38:28 -08:00
Debanjum Singh Solanky
501e7606a0 Increase reference text on 1st expansion to 3 lines and 140 characters 2023-11-10 21:27:04 -08:00
sabaimran
0a950d9382 Fix checker to determine if obsidian client is connected 2023-11-10 19:21:58 -08:00
sabaimran
c736604366 Merge with remote 2023-11-10 17:50:15 -08:00
sabaimran
b0b07bde6c Allow chat reference to expand enough to show the whole reference, rather than constraining the height 2023-11-10 17:49:20 -08:00
sabaimran
14f8c151c8 Fix return type of the generate_chat_response method 2023-11-10 17:48:54 -08:00
Debanjum Singh Solanky
45b8670c25 Fix return type hint for generate_chat_response func 2023-11-10 17:34:19 -08:00
Debanjum Singh Solanky
c9c0ba67c6 Fix chat_client configurations for OpenAI chat director tests 2023-11-10 17:29:23 -08:00
Debanjum Singh Solanky
9b6c5ddba4 Update action row padding in cards on config page of web app 2023-11-10 16:53:25 -08:00
sabaimran
54d4fd0e08 Add chat_model data for logging selected models to telemetry 2023-11-10 16:46:34 -08:00
sabaimran
e695b9ab8c Use notes in system prompt, rather than in the user message 2023-11-10 15:09:33 -08:00
sabaimran
cec932d88a Update prompt so that GPT is more context aware with its capabilities 2023-11-10 14:37:11 -08:00
sabaimran
262a8574d1 Add a test to verify that a user without data sucessfully returns a respones to the /search endpoint 2023-11-10 14:00:58 -08:00
sabaimran
e62788ad79 Await result for determining if user has entries 2023-11-10 13:51:56 -08:00
sabaimran
1a56344f12 Remove the old syncData reference as it no longer exists 2023-11-10 10:10:07 -08:00
Debanjum
a348f1a6ab Reduce Desktop App UX Save, Sync Confusion (#538)
- Show next sync time to make users aware of data sync is automated
- Keep a single Save button to reduce confusion. It does what Save All
  previously did. Intent to manual sync should Save All
- Default to using app.khoj.dev as default Khoj URL to ease Cloud sync setup
- Add detailed chat intro message, mention download desktop app for docs sync
- Only show search in web app nav pane if user has documents indexed
- Hide download desktop app message in web app if synced files exist
- Mark generated profile pic with subscription circle in web app
2023-11-10 00:57:45 -08:00
Debanjum Singh Solanky
39ad1c6ce6 Release Khoj version 0.14.0
Fix Khoj subtitle in manifest of Khoj Obsidian plugin
2023-11-10 00:28:33 -08:00
Debanjum Singh Solanky
745d6bfeed Add detailed intro message, mention download desktop app for docs sync 2023-11-10 00:20:28 -08:00
Debanjum Singh Solanky
6eb7df717c Only show search in web app nav pane if user has documents indexed 2023-11-09 19:14:54 -08:00
Debanjum Singh Solanky
c0789dc57b Use email to get_user_subscription from DB and other DB adapters
- Needing user subscription requires chaining function
- Simplify get_file_sources DB adapter
2023-11-09 19:09:57 -08:00
Debanjum Singh Solanky
841ed95521 Move active user profile halo check into nav pane macro on web app 2023-11-09 18:05:19 -08:00
Debanjum Singh Solanky
ddac693762 Hide download desktop app message in web app if synced files exist 2023-11-09 17:47:00 -08:00
Debanjum Singh Solanky
30a9674f25 Mark generated profile pic with subscription circle in web app 2023-11-09 15:22:38 -08:00
Debanjum Singh Solanky
d6e6ed1cfa Keep single Save button, Show next sync, default to prod Khoj URL in Desktop app
- Make mutable syncing variable not a const
- Show next sync time to make users aware of data sync is automated
- Keep a single Save button to reduce confusion. It does what Save All
  previously did. Intent to manual sync should Save All
- Default to using app.khoj.dev as default Khoj URL to ease setup
2023-11-09 14:04:58 -08:00
Debanjum Singh Solanky
e1f0128576 Change config migration script to update to 0.15.0 version
Next release, 0.14.0 wouldn't contain the migration to Postgres
2023-11-09 12:21:58 -08:00
Debanjum Singh Solanky
17cbbb0b01 Use Consistent Environment Variable for KHOJ_DEBUG 2023-11-09 11:01:28 -08:00
Debanjum Singh Solanky
391db80499 Improve subscribed user profile pictures and nav pane selection
- Add yellow halo around subscribed user profile
- Fix highlighting current page in header nav pane
2023-11-09 00:57:05 -08:00
Debanjum Singh Solanky
605058c72a Allow null user profile picture from Google OAuth in DB
- Fix width of generated profile picture generated for user
- Ignore unused Stripe webhook events
2023-11-09 00:46:59 -08:00
Debanjum
1d3bdf8fdb Create Billing integration. Improve Settings pages on Desktop, Web apps (#537)
### Major
- Expose Billing via Stripe on Khoj Web app for Khoj Cloud subscription
  - Expose card on web app config page to manage subscription to Khoj cloud
  - Create API webhook, endpoints for subscription payments using Stripe
- Put Computer files to index into Card under Content section
  - Show file type icons for each indexed file in config card of web app
  - Enable deleting all indexed desktop files from Khoj via Desktop app
  - Create config page on web app to manage computer files indexed by Khoj
- Track data source (computer, github, notion) of each entry
  - Update content by source via API. Make web client use this API for config
  - Store the data source of each entry in database

### Cleanup
- Set content enabled status on update via config buttons on web app
- Delete deprecated content config pages for local files from web client
- Rename Sync button, Force Sync toggle to Save, Save All buttons

### Fixes
- Prevent Desktop app triggering multiple simultaneous syncs to server
- Upgrade langchain version since adding support for OCR-ing PDFs
- Bubble up content indexing errors to notify user on client apps
2023-11-08 19:55:35 -08:00
Debanjum Singh Solanky
a2609973b8 Disable Subscription if Stripe environment not setup
Deduplicate DJANGO_SECRET_KEY and KHOJ_DJANGO_SECRET_KEY to latter
name as prefixed with KHOJ as KHOJ app specific
2023-11-08 19:39:32 -08:00
Debanjum Singh Solanky
09e1235832 Auto update billing card UI on (re/un-)subscribe click on web app
Previously required a page load to see the updated billing state after
clicking resubscribe or unsubscribe buttons
2023-11-08 18:38:12 -08:00
Debanjum Singh Solanky
8b8bb15866 Keep sync state in memory, initialized to false in Desktop app
Prevent deadlock if desktop app killed in middle of syncing
2023-11-08 18:03:08 -08:00
Debanjum Singh Solanky
c043eb54ae Use typed entry source instead of raw str to map source to conf in api.py 2023-11-08 18:03:08 -08:00
Debanjum Singh Solanky
8178004e6d Move Subscription data into separate table in DB. Merge migrations 2023-11-08 18:03:08 -08:00
Debanjum Singh Solanky
3bb10128ef Move subscription API to separate, independent router 2023-11-08 16:20:27 -08:00
Debanjum Singh Solanky
ec1395d072 Clean, merge subscription update events, API and functions
- Reduce webhook triggers for subscription updates
- Merge subscription update API endpoint, functions for (re/un-)subscribe
2023-11-08 15:55:20 -08:00
Debanjum Singh Solanky
ef5c13f968 Keep user subscription state. Update it when user has unsubscribed 2023-11-08 12:08:36 -08:00
Debanjum Singh Solanky
c52affc6d9 Get Khoj Cloud Subscription URL via environment variable 2023-11-08 12:07:53 -08:00
sabaimran
609d358b1a Use sql datetime comparison for detecting validity of subscription renewal date
- Update the unsubscribe endpoint to use query params
- Use subscription id to process unsubscribe endpoint, rather than the customer id
2023-11-07 19:17:36 -08:00
sabaimran
98cf095b65 Fix bug for rendering chat references in LLM response 2023-11-07 16:44:41 -08:00
sabaimran
0e1cdb6536 Add additional error handling for processing unknown Stripe events and fix typo in STRIPE_SIGNING env variable 2023-11-07 16:43:05 -08:00
sabaimran
08c86927cb Merge branch 'features/multi-user-support-khoj' of github.com:khoj-ai/khoj into fix-improve-config-page-on-desktop-and-web-app 2023-11-07 12:46:49 -08:00
sabaimran
cec54e3a8a Merge pull request #536 from khoj-ai/features/update-chat-ui
Update the chat UI to have richer representation of the references
2023-11-07 12:34:57 -08:00
Debanjum Singh Solanky
f466751f4d Expose card on web app config page to manage subscription to Khoj cloud 2023-11-07 10:21:00 -08:00
Debanjum Singh Solanky
9aaf475c8a Create API webhook, endpoints for subscription payments using Stripe
- Add fields to mark users as subscribed to a specific plan and
  subscription renewal date in DB
- Add ability to unsubscribe a user using their email address
- Expose webhook for stripe to callback confirming payment
2023-11-07 10:20:51 -08:00
Debanjum Singh Solanky
156421d30a Show file type icons for each indexed file in config card of web app 2023-11-07 05:48:44 -08:00
Debanjum Singh Solanky
045c2252d6 Set content enabled status on update via config buttons on web app
Previously hitting configure or disable wouldn't update the state of
the content cards. It needed page refresh to see if the content was
synced correctly.

Now cards automatically get set to new state on hitting disable button
on card or global configure buttons
2023-11-07 05:28:13 -08:00
Debanjum Singh Solanky
7c424e0d5f Enable deleting all indexed desktop files from Khoj via Desktop app 2023-11-07 05:28:13 -08:00
Debanjum Singh Solanky
779fa531a5 Prevent Desktop app triggering multiple simultaneous syncs to server
Lock syncing to server if a sync is already in progress.

While the sync save button gets disabled while sync is in progress,
the background sync job can still trigger a sync in parallel. This
sync lock prevents that
2023-11-07 05:28:13 -08:00
Debanjum Singh Solanky
404d47f1a1 Bubble up content indexing errors to notify user on client apps 2023-11-07 05:28:13 -08:00
Debanjum Singh Solanky
6e957584ac Create config page on web app to manage computer files indexed by Khoj
Remove the table of all files indexed by Khoj. This seems overkill and
doesn't match the UI semantics of the other data sources like Github,
Notion.

Create instead a data source card for computer files with the same
update, disable semantics of the Github and Notion data source cards

Users can disable each data source from its card on the main config page.

They can see/delete individual files indexed from the computer data source
once they click into the computer files data source card on the config page
2023-11-07 04:42:53 -08:00
Debanjum Singh Solanky
d527b644f4 Update content by source via API. Make web client use this API for config 2023-11-07 03:41:19 -08:00
Debanjum Singh Solanky
9ab327a2b6 Store the data source of each entry in database
This will be useful for updating, deleting entries by their data
source. Data source can be one of Computer, Github or Notion for now

Store each file/entries source in database
2023-11-07 02:18:48 -08:00
Debanjum Singh Solanky
c82cd0862a Delete deprecated content config pages for local files from web client
The desktop app now manages syncing local computer files to index
The server only manages "cloud" data source like github and notion.
2023-11-06 23:55:37 -08:00
Debanjum Singh Solanky
9f47fc8e34 Upgrade langchain version since adding support for OCR-ing PDFs 2023-11-06 21:58:33 -08:00
Debanjum Singh Solanky
97cf8339aa Rename Sync button, Force Sync toggle to Save, Save All buttons 2023-11-06 21:57:37 -08:00
Debanjum Singh Solanky
a08b152358 Improve log messages in text_entries and memory leak unit test 2023-11-06 19:27:31 -08:00
sabaimran
6c8689e4ae Update corresponding chat UX in the desktop client as well 2023-11-06 16:18:41 -08:00
sabaimran
e01ecf1419 /s/references/reference to fix bug of jumping references 2023-11-06 16:12:25 -08:00
Debanjum
38f24a037d Improve Indexing Text Entries (#535)
Major
- Ensure search results logic consistent across migration to DB, multi-user
- Manually verified search results for sample queries look the same across migration
 - Flatten indexing code for better indexing progress tracking and code readability

Minor
- a4f407f Test memory leak on MPS device when generating vector embeddings
- ef24485 Improve Khoj with DB setup instructions in the Django app readme (for now)
- f212cc7 Arrange remaining text search tests in arrange, act, assert order
- 022017d Fix text search tests to test updated indexing log messages
2023-11-06 16:01:53 -08:00
sabaimran
270f7b3eb3 Update the chat UI to have richer representation of the references 2023-11-05 15:46:43 -08:00
sabaimran
81a615d7dd Merge pull request #534 from khoj-ai/features/code-config-cleanup
Small fixes and update config UI to manage indexed data
2023-11-05 15:45:45 -08:00
sabaimran
8ebb12820c Add OCR runtime dependencies to prod Dockerfile as well 2023-11-05 15:40:05 -08:00
sabaimran
d697d752c2 Use repeat rather than manually specify auto in grid-template-rows
Co-authored-by: Debanjum <debanjum@gmail.com>
2023-11-05 15:23:42 -08:00
sabaimran
3d6e8d53fe Try adding dependencies for libgl in order to run OCR in github action unit tests 2023-11-05 15:09:40 -08:00
sabaimran
5f1e37fff0 Adjust indentation for css property 2023-11-05 14:33:23 -08:00
sabaimran
fdd727712f Rename test files from x_to_jsonl to x_to_entries 2023-11-05 14:33:07 -08:00
Debanjum Singh Solanky
a4f407f595 Test memory leak on MPS device when generating vector embeddings
Slope threshold of 2.0 determined qualitatively on local Mac device
Minor unused import and clean-up
2023-11-05 03:48:54 -08:00
Debanjum Singh Solanky
ef24485ada Improve Khoj with DB setup instructions in the Django app readme (for now) 2023-11-05 02:04:52 -08:00
Debanjum Singh Solanky
f212cc7174 Arrange remaining text search tests in arrange, act, assert order 2023-11-05 02:04:52 -08:00
Debanjum Singh Solanky
022017dd0f Fix text search tests to test updated indexing log messages 2023-11-05 02:04:52 -08:00
sabaimran
084a8becc5 Fix but to prevent default in chat trigger 2023-11-04 20:13:33 -07:00
Debanjum Singh Solanky
5489e98b9c Do not index org heading entries by default
This is to maintain the previous default behavior
2023-11-04 20:09:25 -07:00
Debanjum Singh Solanky
34b5a86d1d Use SentenceTransformer to disable progress bar when encoding query
The Langchain HuggingFaceEmbeddings wrapper doesn't support disabling
progressbar, not especially for only query but not documents.

This makes the logs noisy with encoding progressbar for each
incremental queries

No features of the Langchain wrapper for SentenceTransformer was
currently being used anyway for now, and we can always switch back to
it if required
2023-11-04 20:09:25 -07:00
Debanjum Singh Solanky
dc9946fc03 Flatten nested loops, improve progress reporting in text_to_jsonl indexer
Flatten the nested loops to improve visibilty into indexing progress

Reduce spurious logs, report the logs at aggregated level and update
the logging description text to improve indexing progress reporting
2023-11-04 20:09:25 -07:00
sabaimran
88eeee3f4b Move try/catch for import one line later 2023-11-04 19:46:47 -07:00
sabaimran
dbaa892665 Flip catching modulenotfound to import error exception 2023-11-04 19:34:10 -07:00
sabaimran
8c3d5a49da Add try/except around image extraction step 2023-11-04 19:27:18 -07:00
sabaimran
fdfab39942 Update the config UI to show all files indexed with option to delete
- Given the separation of the client and server now, the web UI will no longer support configuration of local file paths of data to index
- Expose a way to show all the files that are currently set for indexing, along with an option to delete all or specific files
2023-11-04 19:03:34 -07:00
sabaimran
800bb4f458 Remove references to demo
- The demo setting is no longer necessary for the time being, as we won't have anymore demo instances
2023-11-04 17:17:04 -07:00
sabaimran
b5972e9311 Use OCR to extract image text in PDFs 2023-11-04 17:15:28 -07:00
sabaimran
d1d210605e Merge branch 'features/multi-user-support-khoj' of github.com:khoj-ai/khoj into features/multi-user-support-khoj 2023-11-04 14:29:34 -07:00
sabaimran
3678aa5614 Add tests to validate expected behaviors in the multi-user scenario 2023-11-04 14:29:30 -07:00
Debanjum
12b5ef6540 Improve Theming of Web, Desktop and Obsidian Client App (#532)
- Update theme for Desktop, Web and Obsidian client apps to use lighter colors
- Show splash screen on starting Desktop app
- Make chat the landing page on Desktop and Web clients
- Simplify style of login page on Web app
- Add About page for Desktop app accessible from system tray menu
2023-11-04 12:29:56 -07:00
Debanjum Singh Solanky
8273bf26b7 Fix multi-line chat input and output render on web, desktop clients
- Remove spurious whitespace in chat input box on page load being
  added because text area element was ending on newline
- Do not insert newline in message when send message by hitting enter key
  This would be more evident when send message with cursor in the
  middle of the sentence, as a newline would be inserted at the cursor
  point
- Remove chat message separator tokens from model output. Model
  sometimes starts to output text in it's chat format
2023-11-04 01:09:35 -07:00
Debanjum Singh Solanky
2f1756cc15 Do not use icon for each file, folder to index in desktop app.
Other minor fixes based on PR feedback
2023-11-04 00:13:10 -07:00
Debanjum Singh Solanky
e8f568d79c Make splash screen wider, opaque and fix it's spinner radius
Radius should be such that final spin doesn't extend out of the circle
Opaque background improves contrast for better visual
2023-11-03 23:59:21 -07:00
Debanjum Singh Solanky
3ef05f4803 Use css var for main font color in search, chat page of desktop app 2023-11-03 23:59:21 -07:00
Debanjum Singh Solanky
a19cbde2d7 Add About page for Khoj to Desktop app. Expose it via system tray
- Pass current khoj version from package.json to about page via
  electron IPC between backend js and frontend page
- Update Khoj information in default About screen as well, in case
  it's exposed anywhere else
2023-11-03 23:59:21 -07:00
Debanjum Singh Solanky
a327294ee9 Rename khoj.js to utils.js in web and desktop client apps 2023-11-03 18:13:37 -07:00
Debanjum Singh Solanky
db57eeaefe Console log a welcome message on loading Desktop client 2023-11-03 05:15:41 -07:00
Debanjum Singh Solanky
6fae6fb2a4 Merge branch 'features/multi-user-support-khoj' into improve-client-app-theming 2023-11-03 04:58:41 -07:00
Debanjum Singh Solanky
4cd76311ad Slow down spinning at end of splash sequence. Make animation bigger 2023-11-03 04:28:17 -07:00
Debanjum Singh Solanky
34661c33a2 Show splash screen on starting desktop app 2023-11-03 03:19:08 -07:00
Debanjum Singh Solanky
126d3f4563 Render each file, folder to index row with icon in desktop app
Make the file, folders to index look less like an editable field
2023-11-03 02:48:42 -07:00
Debanjum Singh Solanky
80ae132cad Update Desktop, Obsidian client color theme to lighter yellow
- Update background color to a different shade of white
- Make primary and primary hover colors less intense and more aligned
  with lantern flame shade
- Add water, leaf, flower color variables
2023-11-03 02:48:42 -07:00
sabaimran
fb6ebd19fc Fix refactor bugs, CSRF token issues for use in production (#531)
Fix refactor bugs, CSRF token issues for use in production
* Add flags for samesite settings to enable django admin login
* Include tzdata to dependencies to work around python package issues in linux
* Use DJANGO_DEBUG flag correctly
* Fix naming of entry field when creating EntryDate objects
* Correctly retrieve openai config settings
* Fix datefilter with embeddings name for field
2023-11-02 23:02:38 -07:00
Debanjum Singh Solanky
345856e7be Merge branch 'master' of github.com:khoj-ai/khoj into features/multi-user-support-khoj
Merge changes to use latest GPT4All with GPU, GGUF model support into
khoj multi-user support rearchitecture branch
2023-11-02 22:44:25 -07:00
Debanjum Singh Solanky
041074ccd6 Make chat the landing page for the desktop app
Chat, unlike search, doesn't knowledge base indexing setup.
So you can get started with chat much faster.
2023-11-02 20:42:21 -07:00
Debanjum Singh Solanky
3801105b2a Make chat the landing page for the web app
Chat, unlike search, doesn't knowledge base indexing setup.
So you can get started with chat much faster.
2023-11-02 20:42:21 -07:00
Debanjum Singh Solanky
0d4e7d46c2 Fix color and size of profile picture circle in nav pane 2023-11-02 20:42:21 -07:00
Debanjum Singh Solanky
4fbe8ac6b1 Console log a welcome message on loading web client 2023-11-02 20:42:21 -07:00
Debanjum Singh Solanky
9fc6c97139 Use Khoj standard font family, weight in web client settings page 2023-11-02 20:42:21 -07:00
Debanjum Singh Solanky
b6f07099cd Simplify login page styling on web client
- Center all elements: icon, text and button
- Use khoj icon not logo-text
- Simplify login title text
2023-11-02 20:42:21 -07:00
Debanjum Singh Solanky
7b7f6d3bc8 Update web client theme to a lighter
- Update background color to a different shade of white
- Make primary and primary hover colors less intense and more aligned
  with lantern flame shade
- Add water, leaf, flower color variables
2023-11-02 20:42:21 -07:00
sabaimran
fe860aaf83 Merge branch 'features/multi-user-support-khoj' of github.com:khoj-ai/khoj into features/multi-user-support-khoj 2023-11-02 14:56:01 -07:00
sabaimran
2c9496bcf1 Add additional null checks in the migrate_server_pg script 2023-11-02 14:55:58 -07:00
sabaimran
20df0f5330 Use url_path_for for creating the login page URL in the application 2023-11-02 14:55:14 -07:00
sabaimran
fd11b78552 Fix migration script error when openai not available (#530) 2023-11-02 11:28:08 -07:00
sabaimran
fe6720fa06 [Multi-User Part 8]: Make conversation processor settings server-wide (#529)
- Rather than having each individual user configure their conversation settings, allow the server admin to configure the OpenAI API key or offline model once, and let all the users re-use that code.
- To configure the settings, the admin should go to the `django/admin` page and configure the relevant chat settings. To create an admin, run `python3 src/manage.py createsuperuser` and enter in the details. For simplicity, the email and username should match.
- Remove deprecated/unnecessary endpoints and views for configuring per-user chat settings
2023-11-02 10:43:27 -07:00
Debanjum
0fb81189ca [Multi-User Part 7]: Improve Sign-In UX & Rename DB Models for Readability (#528)
###  New
- Create profile pic drop-down menu in navigation pane
  Put settings page, logout action under drop-down menu

### ⚙️ Fix
- Add Key icon for API keys table on Web Client's settings page

### 🧪 Improve
- Rename `TextEmbeddings` to `TextEntries` for improved readability
- Rename `Db.Models` `Embeddings`, `EmbeddingsAdapter` to `Entry`, `EntryAdapter`
- Show truncated API key for identification & restrict table width for config page responsiveness
2023-11-01 18:05:20 -07:00
Debanjum Singh Solanky
12b3eeae9e Use Khoj fonts on config page of web and desktop apps too
Previously pico.css font-families were being selected for the config
page. This was different from the fonts used by index.html, chat.html

This improves spacing issue of heading further
2023-11-01 17:50:50 -07:00
Debanjum Singh Solanky
022d695309 Switch to narrow view below width of 700px on web client
This makes the dropdown menu align better to the profile picture in
mobile view
2023-11-01 17:49:44 -07:00
Debanjum Singh Solanky
6a0adfbfbb Default to profile picture with Initial if user has no profile picture 2023-11-01 17:49:44 -07:00
Tuan Nguyen
354605e73e Autofocus to chat input when openning chat (#524) 2023-11-01 16:09:45 -07:00
Debanjum Singh Solanky
d92a2d03a7 Rename Files, Classes from X_To_JSONL to more appropriate X_To_Entries
These content processors are converting content into entries in DB
instead of entries in JSONL file
2023-11-01 14:51:33 -07:00
Debanjum Singh Solanky
2ad2055bcb Remove user null check in API controllers that require authentication 2023-11-01 14:38:19 -07:00
Debanjum Singh Solanky
7ac5a4766d Match spacing of navigation header pane in config vs search/chat pages 2023-11-01 14:38:19 -07:00
Debanjum Singh Solanky
2e3a4a6a9b Use Jinja macro to deduplicate navigation header HTML 2023-11-01 14:38:12 -07:00
Debanjum Singh Solanky
c631b61a81 Put colors shared by index, chat html into khoj css global variables 2023-11-01 02:13:24 -07:00
Debanjum Singh Solanky
f585a71744 Put logout, settings under dropdown menu with logged in user's profile picture
- Create dropdown menu. Put settings page, logout action under it
- Make user's profile picture the dropdown menu heading
- Create khoj.js to store shared js across web client
  It currently stores the dropdown menu open, close functionality
- Put shared styling for khoj dropdown menu under khoj.css
2023-11-01 02:13:24 -07:00
Debanjum Singh Solanky
58a7171911 Show truncated API key for identification & restrict table width
- Use a function to generate API Key table row HTML, to dedup logic
- Show delete, copy icon hints on hover
- Reduce length of copied message to not expand table width
- Truncating API key helps keep the API key table width within width
  of smaller width displays
2023-10-31 23:10:26 -07:00
Debanjum Singh Solanky
9cebd7f856 Add emoji icons to Search, Chat, Settings items in nav menu of Web client
Emoji icons have already been added to the Search, Chat and Settings
top navigation menu in the desktop client. This change adds these to
the web client as well
2023-10-31 22:38:44 -07:00
Debanjum Singh Solanky
f77336ba61 Add key icon for API keys table in Web client config page 2023-10-31 19:01:09 -07:00
Debanjum Singh Solanky
87e6b1eab9 Rename TextEmbeddings to TextEntries for improved readability
Improves readability as name has closer match to underlying
constructs
2023-10-31 18:55:59 -07:00
Debanjum Singh Solanky
bcbee05a9e Rename DbModels Embeddings, EmbeddingsAdapter to Entry, EntryAdapter
Improves readability as name has closer match to underlying
constructs

- Entry is any atomic item indexed by Khoj. This can be an org-mode
  entry, a markdown section, a PDF or Notion page etc.

- Embeddings are semantic vectors generated by the search ML model
  that encodes for meaning contained in an entries text.

- An "Entry" contains "Embeddings" vectors but also other metadata
  about the entry like filename etc.
2023-10-31 18:50:54 -07:00
sabaimran
54a387326c [Multi-User Part 6]: Address small bugs and upstream PR comments (#518)
- 08654163cb: Add better parsing for XML files
- f3acfac7fb: Add a try/catch around the dateparser in order to avoid internal server errors in app
- 7d43cd62c0: Chunk embeddings generation in order to avoid large memory load
- e02d751eb3: Addresses comments from PR #498 
- a3f393edb4: Addresses comments from PR #503 
- 66eb078286: Addresses comments from PR #511 
- Address various items in https://github.com/khoj-ai/khoj/issues/527
2023-10-31 17:59:53 -07:00
sabaimran
5f3f6b7c61 [Multi-User Part 5]: Add a production Docker file and use a gunicorn configuration with it (#514)
- Add a productionized setup for the Khoj server using `gunicorn` with multiple workers for handling requests
- Add a new Dockerfile meant for production config at `ghcr.io/khoj-ai/khoj:prod`; the existing Docker config should remain the same
2023-10-26 13:15:31 -07:00
Debanjum
9acc722f7f [Multi-User Part 4]: Authenticate using API Tokens (#513)
###  New
- Use API keys to authenticate from Desktop, Obsidian, Emacs clients
- Create API, UI on web app config page to CRUD API Keys
- Create user API keys table and functions to CRUD them in Database

### 🧪 Improve
- Default to better search model, [gte-small](https://huggingface.co/thenlper/gte-small), to improve search quality
- Only load chat model to GPU if enough space, throw error on load failure
- Show encoding progress, truncate headings to max chars supported
- Add instruction to create db in Django DB setup Readme

### ⚙️ Fix
- Fix error handling when configure offline chat via Web UI
- Do not warn in anon mode about Google OAuth env vars not being set
- Fix path to load static files when server started from project root
2023-10-26 12:33:03 -07:00
sabaimran
4b6ec248a6 [Multi-User Part 3]: Separate chat sesssions based on authenticated users (#511)
- Add a data model which allows us to store Conversations with users. This does a minimal lift over the current setup, where the underlying data is stored in a JSON file. This maintains parity with that configuration.
- There does _seem_ to be some regression in chat quality, which is most likely attributable to search results.

This will help us with #275. It should become much easier to maintain multiple Conversations in a given table in the backend now. We will have to do some thinking on the UI.
2023-10-26 11:37:41 -07:00
sabaimran
a8a82d274a [Multi-User Part 2]: Add login pages and gate access to application behind login wall (#503)
- Make most routes conditional on authentication *if anonymous mode is not enabled*. If anonymous mode is enabled, it scaffolds a default user and uses that for all application interactions.
- Add a basic login page and add routes for redirecting the user if logged in
2023-10-26 10:17:29 -07:00
sabaimran
216acf545f [Multi-User Part 1]: Enable storage of settings for plaintext files based on user account (#498)
- Partition configuration for indexing local data based on user accounts
- Store indexed data in an underlying postgres db using the `pgvector` extension
- Add migrations for all relevant user data and embeddings generation. Very little performance optimization has been done for the lookup time
- Apply filters using SQL queries
- Start removing many server-level configuration settings
- Configure GitHub test actions to run during any PR. Update the test action to run in a containerized environment with a DB.
- Update the Docker image and docker-compose.yml to work with the new application design
2023-10-26 09:42:29 -07:00
Debanjum Singh Solanky
9677eae791 Expose CLI flag to disable using GPU for offline chat model
- Offline chat models outputing gibberish when loaded onto some GPU.
  GPU support with Vulkan in GPT4All seems a bit buggy

- This change mitigates the upstream issue by allowing user to
  manually disable using GPU for offline chat

Closes #516
2023-10-25 17:51:46 -07:00
Debanjum Singh Solanky
5bb14a05a0 Update system requirements in docs for offline chat models 2023-10-22 19:04:23 -07:00
Debanjum Singh Solanky
0f1ebcae18 Upgrade to latest GPT4All. Use Mistral as default offline chat model
GPT4all now supports gguf llama.cpp chat models. Latest
GPT4All (+mistral) performs much at least 3x faster.

On Macbook Pro at ~10s response start time vs 30s-120s earlier.
Mistral is also a better chat model, although it hallucinates more
than llama-2
2023-10-22 19:04:23 -07:00
sabaimran
6dc0df3afb Pin pytorch version to 2.0.1 in order to avoid exit code 139 in Docker container (#512) 2023-10-20 14:10:21 -07:00
sabaimran
963cd165eb Resolve merge conflicts 2023-10-19 14:39:05 -07:00
Simon Butler
e3f8a95784 Update emacs.md (#510)
Minor correction for emacs-lisp in minimal install
2023-10-19 12:28:08 -07:00
Debanjum
d93395ae48 Set >=6Gb RAM required for offline chat
Llama v2 7B with 4bit quantization technically needs ~3.5Gb RAM (7B * 0.5byte), practically a system with 6Gb of RAM should suffice
2023-10-18 12:05:54 -07:00
Debanjum Singh Solanky
8346e1193c Release Khoj version 0.13.0 2023-10-18 03:43:54 -07:00
Debanjum Singh Solanky
6631fc38db Delete plaintext config via API. Catch any offline model loading exception 2023-10-18 03:37:45 -07:00
Debanjum Singh Solanky
53abd1a506 Mark sync completed on desktop client, even when no files to send
Previously Sync spinner on desktop config screen would hang when no
files to send to server & the Sync button had been manually triggered
2023-10-18 01:30:56 -07:00
Debanjum Singh Solanky
71b0012e8c Set offline chat config to default value if unset on server load 2023-10-18 00:59:43 -07:00
Debanjum Singh Solanky
cf1cdc3fe1 Disambiguate input_filter variable names in fs_syncer functions 2023-10-17 23:32:10 -07:00
Debanjum Singh Solanky
e3cd8b4150 Only index files returned by input-filter globs in fs_syncer
Ignore .org, .pdf etc. suffixed directories under `input-filter' from
being evaluated as files.

Explicitly filter results by input-filter globs to only index files,
not directory for each text type

Add test to prevent regression

Closes #448
2023-10-17 23:32:10 -07:00
Debanjum Singh Solanky
51363d280d Do not configure khoj server for pull based indexing from khoj.el
Do not make khoj server pull update index on Obsidian plugin load.
Index is updated on push from plugin instead now/
2023-10-17 21:47:19 -07:00
Debanjum Singh Solanky
d9d133dfb9 Read text files as utf-8, instead of default os locale
On Windows, the default locale isn't utf8. Khoj had regressed to
reading files in OS specified locale encoding, e.g cp1252, cp949 etc.

It now explicitly uses utf8 encoding to read text files for indexing

Resolves #495, resolves #472
2023-10-17 21:47:19 -07:00
Debanjum
3d4576ae38 Fix encoding binary files for sync from the Desktop, Obsidian client (#506)
- Fix encoding binary files like PDFs for sync from Desktop client
- Fix encoding binary files like PDFs for sync from Obsidian client
2023-10-17 15:37:22 -07:00
Debanjum Singh Solanky
c8293998d9 Fix encoding binary files like PDFs for sync from Obsidian client
Use readBinary to read binary files like PDFs instead of read
2023-10-17 15:08:30 -07:00
sabaimran
ba60c869c9 Fix encoding binary files like PDFs for sync from Desktop client
Use readFileSync, Buffer to pass appropriately formatted binary data
2023-10-17 15:08:23 -07:00
Andrew Spott
3d7381446d Changed globbing. Now doesn't clobber a users glob if they want to a… (#496)
* Changed globbing.  Now doesn't clobber a users glob if they want to add it, but will (if just given a directory), add a recursive glob.

Note: python's glob engine doesn't support `{}` globing, a future option is to warn if that is included.

* Fix typo in globformat variable

* Use older glob pattern for plaintext files

---------

Co-authored-by: Saba <narmiabas@gmail.com>
2023-10-17 11:26:06 -07:00
sabaimran
2646c8554d Provide a default value to offline_chat configuration of the conversation processor 2023-10-17 10:35:22 -07:00
Debanjum Singh Solanky
b8976426eb Update offline chat model config schema used by Emacs, Obsidian clients
The server uses a new schema for the conversation config. The Emacs,
Obsidian clients need to use this schema to update the conversation
config
2023-10-17 07:01:35 -07:00
Debanjum
ecc6fbfeb2 Push Files to Index from Emacs, Obsidian & Desktop Clients using Multi-Part Forms (#499)
### Overview
- Add ability to push data to index from the Emacs, Obsidian client
- Switch to standard mechanism of syncing files via HTTP multi-part/form. Previously we were streaming the data as JSON
  - Benefits of new mechanism
    - No manual parsing of files to send or receive on clients or server is required as most have in-built mechanisms to send multi-part/form requests
    - The whole response is not required to be kept in memory to parse content as JSON. As individual files arrive they're automatically pushed to disk to conserve memory if required
    - Binary files don't need to be encoded on client and decoded on server

### Code Details
### Major
- Use multi-part form to receive files to index on server
- Use multi-part form to send files to index on desktop client
- Send files to index on server from the khoj.el emacs client
  - Send content for indexing on server at a regular interval from khoj.el
- Send files to index on server from the khoj obsidian client
- Update tests to test multi-part/form method of pushing files to index

#### Minor
- Put indexer API endpoint under /api path segment
- Explicitly make GET request to /config/data from khoj.el:khoj-server-configure method
- Improve emoji, message on content index updated via logger
- Don't call khoj server on khoj.el load, only once khoj invoked explicitly by user
- Improve indexing of binary files
  - Let fs_syncer pass PDF files directly as binary before indexing
  - Use encoding of each file set in indexer request to read file 
- Add CORS policy to khoj server. Allow requests from khoj apps, obsidian & localhost
- Update indexer API endpoint URL to` index/update` from `indexer/batch`

Resolves #471 #243
2023-10-17 06:05:15 -07:00
Debanjum Singh Solanky
7b1c62ba53 Mark test_get_configured_types_via_api unit test as flaky
It passes locally on running individually but fails when run in
parallel on local or CI
2023-10-17 05:56:00 -07:00
Debanjum Singh Solanky
6a4f1b2188 Add more client, request details in logs by index/update API endpoint 2023-10-17 05:43:29 -07:00
Debanjum Singh Solanky
5efae1ad55 Update indexer API endpoint query params for force, content type
New URL query params, `force' and `t' match name of query parameter in
existing Khoj API endpoints

Update Desktop, Obsidian and Emacs client to call using these new API
query params. Set `client' query param from each client for telemetry
visibility
2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky
84654ffc5d Update indexer API endpoint URL to index/update from indexer/batch
New URL follows action oriented endpoint naming convention used for
other Khoj API endpoints

Update desktop, obsidian and emacs client to call this new API
endpoint
2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky
e347823ff4 Log telemetry for index updates via push to API endpoint 2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky
05be6bd877 Clicking Update Index in Obsidian settings should push files to index
Use the indexer/batch API endpoint to regenerate content index rather
than the previous pull based content indexing API endpoint
2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky
13a3122bf3 Stop configuring server to pull files to index from Obsidian client
Obsidian client now pushes vault files to index instead
2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky
99a2c934a3 Add CORS policy to allow requests from khoj apps, obsidian & localhost
Using fetch from Khoj Obsidian plugin was failing due to cross-origin
request and method: no-cors didn't allow passing x-api-key custom
header. And using Obsidian's request with multi-part/form-data wasn't
possible either.
2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky
541cd59a49 Let fs_syncer pass PDF files directly as binary before indexing
No need to do unneeded base64 encoding/decoding to pass pdf contents
for indexing from fs_syncer to pdf_to_jsonl
2023-10-17 04:58:13 -07:00
Debanjum Singh Solanky
d27dc71dfe Use encoding of each file set in indexer request to read file
Get encoding type from multi-part/form-request body for each file
Read text files as utf-8 and pdfs, images as binary
2023-10-17 04:58:12 -07:00
Debanjum Singh Solanky
8e627a5809 Pass any files to be deleted to indexer API via Khoj Obsidian plugin
- Keep state of previously synced files to identify files to be deleted
- Last synced files stored in settings for persistence of this data
  across Obsidian reboots
2023-10-17 03:34:49 -07:00
Debanjum Singh Solanky
f2e293a149 Push Vault files to index to Khoj server using Khoj Obsidian plugin
Use the multi-part/form-data request to sync Markdown, PDF files in
vault to index on khoj server

Run scheduled job to push updates to value for indexing every 1 hour
2023-10-17 03:05:30 -07:00
Debanjum Singh Solanky
6baaaaf91a Test request body of multi-part form to update content index from khoj.el 2023-10-16 23:54:32 -07:00
Debanjum Singh Solanky
79b3f8273a Make khoj.el send files to be deleted from index to server 2023-10-16 23:53:02 -07:00
Debanjum Singh Solanky
5dc399b32e Document system requirements to run offline chat
Closes #375
2023-10-16 19:39:06 -07:00
Debanjum Singh Solanky
f64fa06e22 Initialize the Khoj Transient menu on first run instead of load
This prevents Khoj from polling the Khoj server until explicitly
invoked via `khoj' entrypoint function.

Previously it'd make a request to the khoj server every time Emacs or
khoj.el was loaded

Closes #243
2023-10-16 19:11:46 -07:00
Debanjum
b4949f7f0b Improve Offline Chat Model Experience (#494)
- Make offline chat model user configurable. Use `filename` of any [GPT4All supported  model](https://github.com/nomic-ai/gpt4all/blob/main/gpt4all-chat/metadata/models.json) like below:
- Run GPT4All Chat Model on GPU, when available via [GPT4All Vulcan support](https://blog.nomic.ai/posts/gpt4all-gpu-inference-with-vulkan)
- Use default Llama 2 supported by GPT4All
- Make `tokenizer` and `max-prompt-size` of chat model user configurable. E.g When using chat models not in [this pre-defined list](https://github.com/khoj-ai/khoj/blob/master/src/khoj/processor/conversation/utils.py) that support larger context window or a different tokenizer.

Closes #406, #418
2023-10-16 17:44:49 -07:00
Debanjum Singh Solanky
644c3b787f Scale no. of chat history messages to use as context with max_prompt_size
Previously lookback turns was set to a static 2. But now that we
support more chat models, their prompt size vary considerably.

Make lookback_turns proportional to max_prompt_size. The truncate_messages
can remove messages if they exceed max_prompt_size later

This lets Khoj pass more of the chat history as context for models
with larger context window
2023-10-16 17:22:28 -07:00
Debanjum Singh Solanky
90e1d9e3d6 Pin gpt4all to 1.0.12 as next version will introduce breaking changes 2023-10-16 10:57:16 -07:00
Debanjum Singh Solanky
1a9023d396 Update Chat Actor test to not incept with prior world knowledge 2023-10-15 17:22:44 -07:00
Debanjum Singh Solanky
df1d74a879 Use max_prompt_size, tokenizer from config for chat model context stuffing 2023-10-15 16:52:53 -07:00
Debanjum Singh Solanky
116595b351 Use chat_model specified in new offline_chat section of config
- Dedupe offline_chat_model variable. Only reference offline chat
  model stored under offline_chat. Delete the previous chat_model
  field under GPT4AllProcessorConfig

- Set offline chat model to use via config/offline_chat API endpoint
2023-10-15 16:37:49 -07:00
Debanjum Singh Solanky
feb4f17e3d Update chat config schema. Make max_prompt, chat tokenizer configurable
This provides flexibility to use non 1st party supported chat models

- Create migration script to update khoj.yml config
  - Put `enable_offline_chat' under new `offline-chat' section
    Referring code needs to be updated to accomodate this change
  - Move `offline_chat_model' to `chat-model' under new `offline-chat' section
  - Put chat `tokenizer` under new `offline-chat' section
  - Put `max_prompt' under existing `conversation' section
    As `max_prompt' size effects both openai and offline chat models
2023-10-15 16:35:11 -07:00
sabaimran
c125995d94 [Multi-User]: Part 0 - Add support for logging in with Google (#487)
* Add concept of user authentication to the request session via GoogleUser
2023-10-14 19:39:13 -07:00
Debanjum Singh Solanky
247e75595c Use AutoTokenizer to support more tokenizers 2023-10-14 16:54:52 -07:00
Saba
ff2dbadc9d Use computed plaintext_content to set file content rather than calling f.read again 2023-10-14 13:28:34 -07:00
Debanjum Singh Solanky
1ad8b150e8 Add default tokenizer, max_prompt as fallback for non-default offline chat models
Pass user configured chat model as argument to use by converse_offline

The proper fix for this would allow users to configure the max_prompt
and tokenizer to use (while supplying default ones, if none provided)
For now, this is a reasonable start.
2023-10-13 22:48:56 -07:00
Debanjum Singh Solanky
56bd69d5af Improve Llama v2 extract questions actor and associated prompt
- Format extract questions prompt format with newlines and whitespaces
- Make llama v2 extract questions prompt consistent

- Remove empty questions extracted by offline extract_questions actor
- Update implicit qs extraction unit test for offline search actor
2023-10-13 22:48:56 -07:00
sabaimran
09bb3686cc Strip the incoming query from the slash conversation command (#500)
* Strip the incoming query from the slash conversation command before passing it to the model or for search
* Return q when content index not loaded
* Remove -n 4 from pytest ini configuration to isolate test failures
2023-10-13 21:11:23 -07:00
Debanjum Singh Solanky
96c0b21285 Sync desktop app package.json with other Khoj clients metadata
- Make `bump_version.sh' script set version for the Khoj desktop app too
- Sync Khoj desktop app authors, license, description and version with
  the other interfaces and server
- Update description in packages metadata to match project subtitle on Github
2023-10-13 20:43:55 -07:00
sabaimran
80fb56b8a5 Sync deksktop app package version with the other releases 2023-10-13 19:23:00 -07:00
Debanjum Singh Solanky
b669aa2395 Clean and fix the content indexing code in the Emacs client
- Pass payloads as unibyte. This was causing the request to fail for
  files with unicode characters
- Suppress messages with file content in on index updates
- Fix rendering response from server on index update API call
- Extract code to populate body of index update HTTP request with files
2023-10-13 18:00:37 -07:00
Debanjum Singh Solanky
bea196aa30 Explicitly make GET request to /config/data from khoj.el:khoj-server-configure method
Previously global state of `url-request-method' would affect the
kind of request made to api/config/data API endpoint as it wasn't
being explicitly being set before calling the API endpoint

This was done with the assumption that the default value of GET for
url-request-method wouldn't change globally

But in some cases, experientially, it can get changed. This was
resulting in khoj.el load failing as POST request was being made
instead which would throw error
2023-10-12 20:58:52 -07:00
Debanjum Singh Solanky
292f0420ad Send content for indexing on server at a regular interval from khoj.el
- Allow indexing frequency to be configurable by user
- Ensure there is only one khoj indexing timer running
2023-10-12 20:58:52 -07:00
Debanjum Singh Solanky
bed3aff059 Update tests to test multi-part/form method of pushing files to index
Instead of using the previous method to push data as json payload of POST request
pass it as files to upload via the multi-part/form to the batch indexer API endpoint
2023-10-12 20:58:52 -07:00
Debanjum Singh Solanky
fc99431754 Send files to index on server from the khoj.el emacs client
- Add elisp variable to set API key to engage with the Khoj server
- Use multi-part form to POST the files to index to the indexer API
  endpoint on the khoj server
2023-10-12 20:58:52 -07:00
Debanjum Singh Solanky
68018ef397 Use multi-part form to send files to index on desktop client
- Add typing for variables in for loop and other minor formatting clean-up
- Assume utf8 encoding for text files and binary for image, pdf files
2023-10-12 20:58:49 -07:00
Debanjum Singh Solanky
7190b3811d Remove all filter terms in user query from defiltered_query
Previously only the the last filter's terms were getting effectively
applied as the `filter.defilter' operation was being done on
`user_query' but was updating the `defiltered_query'
2023-10-12 20:56:17 -07:00
Debanjum Singh Solanky
72f8fde7ef Run pytests in parallel on multiple CPU cores using pytest-xdist for speed 2023-10-12 20:56:17 -07:00
Debanjum Singh Solanky
60e9a61647 Use multi-part form to receive files to index on server
- This uses existing HTTP affordance to process files
  - Better handling of binary file formats as removes need to url encode/decode
  - Less memory utilization than streaming json as files get
    automatically written to disk once memory utilization exceeds preset limits
  - No manual parsing of raw files streams required
2023-10-11 23:58:23 -07:00
Debanjum Singh Solanky
9ba173bc2d Improve emoji, message on content index updated via logger
Use mailbox closed with flag down once content index completed.

Use standard, existing logger messages in new indexer messages, when
files to index sent by clients
2023-10-11 17:12:03 -07:00
Debanjum Singh Solanky
6aa69da3ef Put indexer API endpoint under /api path segment
Update FastAPI app router, desktop app and to use new url path to
batch indexer API endpoint

All api endpoints should exist under /api path segment
2023-10-09 21:35:58 -07:00
Debanjum Singh Solanky
148e8f468f Restrict openai package version below 1.0.0 to avoid breaking changes 2023-10-09 19:30:58 -07:00
Debanjum Singh Solanky
f6f7a62d80 Wait for user to stop typing to trigger search from khoj.el in Emacs
- Improves user experience by aligning idle time with search latency
  to avoid display jitter (to render results) while user is typing

- Makes the idle time configurable

Closes #480
2023-10-06 12:44:45 -07:00
sabaimran
5c4f0d42b7 Return new default config in API endpoint 2023-10-06 12:30:09 -07:00
sabaimran
052b25af0a Update default configuration passed to Khoj clients to circumvent valiation issues 2023-10-06 12:29:15 -07:00
Debanjum Singh Solanky
a85ff941ca Make offline chat model user configurable
Only GPT4All supported Llama v2 models will work given the prompt
structure is not currently configurable
2023-10-04 20:41:14 -07:00
Debanjum Singh Solanky
d1ff812021 Run GPT4All Chat Model on GPU, when available
GPT4All now supports running models on GPU via Vulkan
2023-10-04 18:42:12 -07:00
Debanjum Singh Solanky
13b16a4364 Use default Llama 2 supported by GPT4All
Remove custom logic to download custom Llama 2 model.
This was added as GPT4All didn't support Llama 2 when it was added to Khoj
2023-10-03 19:01:54 -07:00
sabaimran
4a5ed7f06c Update Khoj package version for Electron, Desktop app (#492)
* Address package upgrade for Electron application
* Update package version for Electron desktop application
2023-10-03 12:21:32 -07:00
sabaimran
3f962a55c3 Fix Linux Desktop Application (#491)
* Use separate functions for adding files and folders to configuration for indexing
* Add a loading bar while data is syncing
* Bump the minor version for the application
2023-10-03 11:43:19 -07:00
sabaimran
63b3696af0 Release Khoj version 0.12.3 2023-09-26 22:41:11 -07:00
sabaimran
d2f9bca1cf Fix null ref issue in query method and update logic for determining whether khoj is already configured in obsidian 2023-09-26 22:33:44 -07:00
sabaimran
2f18383349 Release Khoj version 0.12.2 2023-09-26 11:59:47 -07:00
sabaimran
588f35b6e9 Add max prompt size for gpt-3.5-turbo-16k 2023-09-26 10:57:35 -07:00
sabaimran
99f9c3f8e2 Update setup instructions 2023-09-26 09:40:36 -07:00
550 changed files with 60382 additions and 28241 deletions

42
.github/ISSUE_TEMPLATE/bug-report.md vendored Normal file
View File

@@ -0,0 +1,42 @@
---
name: Bug Report
about: Create a bug to help fix something that might not be working correctly
title: "[FIX]"
labels: fix
assignees: ''
---
## Describe the bug
A clear and concise description of what the bug is. Please include what you were expecting to happen vs. what actually happened.
## To Reproduce
Steps to reproduce the behavior:
## Screenshots
If applicable, add screenshots to help explain your problem.
## Platform
- Server:
- [ ] Cloud-Hosted (https://app.khoj.dev)
- [ ] Self-Hosted Docker
- [ ] Self-Hosted Python package
- [ ] Self-Hosted source code
- Client:
- [ ] Obsidian
- [ ] Emacs
- [ ] Desktop app
- [ ] Web browser
- [ ] WhatsApp
- OS:
- [ ] Windows
- [ ] macOS
- [ ] Linux
- [ ] Android
- [ ] iOS
### If self-hosted
- Server Version [e.g. 1.0.1]:
## Additional context
Add any other context about the problem here.

View File

@@ -0,0 +1,11 @@
---
name: Feature Request
about: Suggest an idea to help make Khoj a better tool
title: "[IDEA]"
labels: "upgrade"
assignees: ''
---
## Describe the feature you'd like
A clear and concise description of what you want to happen. Include any relevant links or screenshots or inspiration.

View File

@@ -21,9 +21,9 @@ jobs:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Set up Python 3.9
- name: Set up Python 3.11
uses: actions/setup-python@v1
with: { python-version: 3.9 }
with: { python-version: 3.11 }
- name: ⏬️ Install Dependencies
run: |
python -m pip install --upgrade pip

99
.github/workflows/desktop.yml vendored Normal file
View File

@@ -0,0 +1,99 @@
name: desktop
on:
push:
tags:
- "*"
branches:
- 'master'
paths:
- src/interface/desktop/**
- .github/workflows/desktop.yml
jobs:
build:
name: 🖥️ Build, Release Desktop App
runs-on: ubuntu-latest
env:
TODESKTOP_ACCESS_TOKEN: ${{ secrets.TODESKTOP_ACCESS_TOKEN }}
TODESKTOP_EMAIL: ${{ secrets.TODESKTOP_EMAIL }}
defaults:
run:
shell: bash
working-directory: src/interface/desktop
steps:
- name: ⬇️ Checkout Code
uses: actions/checkout@v3
with:
fetch-depth: 0
- name: ⤵️ Install Node
uses: actions/setup-node@v3
with:
node-version: "lts/*"
- name: ⚙️ Setup Desktop Build
run: |
yarn
npm install -g @todesktop/cli
sed -i "s/\"id\": \"\"/\"id\": \"${{ secrets.TODESKTOP_ID }}\"/g" todesktop.json
- name: ⚙️ Build Desktop App
run: |
npx todesktop build
- name: 📦 Release Desktop App
if: startsWith(github.ref, 'refs/tags/')
run: |
npx todesktop release --latest --force
- name: ⤵️ Get Desktop Apps
if: startsWith(github.ref, 'refs/tags/')
run: |
build_no=`npx todesktop builds --latest | tail -n 1 | awk -F'/' '{print $NF}'`
sleep 900 # wait for 15 minutes for the build to be available
wget https://download.khoj.dev/builds/$build_no/mac/dmg/arm64 -O khoj-${{ github.ref_name }}-arm64.dmg
wget https://download.khoj.dev/builds/$build_no/mac/dmg/x64 -O khoj-${{ github.ref_name }}-x64.dmg
wget https://download.khoj.dev/builds/$build_no/windows/nsis/x64 -O khoj-${{ github.ref_name }}-x64.exe
wget https://download.khoj.dev/builds/$build_no/linux/deb/x64 -O khoj-${{ github.ref_name }}-x64.deb
wget https://download.khoj.dev/builds/$build_no/linux/appImage/x64 -O khoj-${{ github.ref_name }}-x64.AppImage
- name: ⏫ Upload Mac ARM App
if: startsWith(github.ref, 'refs/tags/')
uses: actions/upload-artifact@v3
with:
if-no-files-found: warn
name: khoj-${{ github.ref_name }}-arm64.dmg
path: src/interface/desktop/khoj-${{ github.ref_name }}-arm64.dmg
- name: ⏫ Upload Mac x64 App
if: startsWith(github.ref, 'refs/tags/')
uses: actions/upload-artifact@v3
with:
if-no-files-found: warn
name: khoj-${{ github.ref_name }}-x64.dmg
path: src/interface/desktop/khoj-${{ github.ref_name }}-x64.dmg
- name: ⏫ Upload Windows App
if: startsWith(github.ref, 'refs/tags/')
uses: actions/upload-artifact@v3
with:
if-no-files-found: warn
name: khoj-${{ github.ref_name }}-x64.exe
path: src/interface/desktop/khoj-${{ github.ref_name }}-x64.exe
- name: ⏫ Upload Debian App
if: startsWith(github.ref, 'refs/tags/')
uses: actions/upload-artifact@v3
with:
if-no-files-found: warn
name: khoj-${{ github.ref_name }}-x64.deb
path: src/interface/desktop/khoj-${{ github.ref_name }}-x64.deb
- name: ⏫ Upload Linux App Image
if: startsWith(github.ref, 'refs/tags/')
uses: actions/upload-artifact@v3
with:
if-no-files-found: warn
name: khoj-${{ github.ref_name }}-x64.AppImage
path: src/interface/desktop/khoj-${{ github.ref_name }}-x64.AppImage

View File

@@ -8,23 +8,49 @@ on:
- master
paths:
- src/khoj/**
- config/**
- src/interface/web/**
- pyproject.toml
- Dockerfile
- prod.Dockerfile
- docker-compose.yml
- .github/workflows/dockerize.yml
workflow_dispatch:
inputs:
tag:
description: 'Docker image tag'
default: 'dev'
khoj:
description: 'Build Khoj docker image'
type: boolean
default: true
khoj-cloud:
description: 'Build Khoj cloud docker image'
type: boolean
default: true
env:
DOCKER_IMAGE_TAG: ${{ github.ref == 'refs/heads/master' && 'latest' || github.ref_name }}
# Tag Image with tag name on release
# else with user specified tag (default 'dev') if triggered via workflow
# else with run_id if triggered via a pull request
# else with 'pre' (if push to master)
DOCKER_IMAGE_TAG: ${{ github.ref_type == 'tag' && github.ref_name || github.event_name == 'workflow_dispatch' && github.event.inputs.tag || 'pre' }}
jobs:
build:
name: Build Docker Image, Push to Container Registry
name: Publish Khoj Docker Images
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
image:
- 'local'
- 'cloud'
steps:
- name: Checkout Code
uses: actions/checkout@v3
with:
# Get all history to correctly infer Khoj version using hatch
fetch-depth: 0
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v2
@@ -36,13 +62,39 @@ jobs:
username: ${{ github.repository_owner }}
password: ${{ secrets.PAT }}
- name: Get App Version
id: hatch
run: echo "version=$(pipx run hatch version)" >> $GITHUB_OUTPUT
- name: 🧹 Delete huge unnecessary tools folder
run: rm -rf /opt/hostedtoolcache
- name: 📦 Build and Push Docker Image
uses: docker/build-push-action@v2
if: (matrix.image == 'local' && github.event_name == 'workflow_dispatch') && github.event.inputs.khoj == 'true' || (matrix.image == 'local' && github.event_name == 'push')
with:
context: .
file: Dockerfile
platforms: linux/amd64, linux/arm64
push: true
tags: ghcr.io/${{ github.repository }}:${{ env.DOCKER_IMAGE_TAG }}
tags: |
ghcr.io/${{ github.repository }}:${{ env.DOCKER_IMAGE_TAG }}
${{ github.ref_type == 'tag' && format('ghcr.io/{0}:latest', github.repository) || '' }}
build-args: |
VERSION=${{ steps.hatch.outputs.version }}
PORT=42110
- name: 📦️⛅️ Build and Push Cloud Docker Image
uses: docker/build-push-action@v2
if: (matrix.image == 'cloud' && github.event_name == 'workflow_dispatch') && github.event.inputs.khoj-cloud == 'true' || (matrix.image == 'cloud' && github.event_name == 'push')
with:
context: .
file: prod.Dockerfile
platforms: linux/amd64
push: true
tags: |
ghcr.io/${{ github.repository }}-cloud:${{ env.DOCKER_IMAGE_TAG }}
${{ github.ref_type == 'tag' && format('ghcr.io/{0}-cloud:latest', github.repository) || '' }}
build-args: |
VERSION=${{ steps.hatch.outputs.version }}
PORT=42110

View File

@@ -0,0 +1,46 @@
name: build and deploy github pages for documentation
on:
push:
branches:
- 'master'
permissions:
contents: read
pages: write
id-token: write
jobs:
deploy:
environment:
name: github-pages
url: https://docs.khoj.dev
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@v3
# 👇 Build steps
- name: Set up Node.js
uses: actions/setup-node@v3
with:
node-version: 18.x
cache: yarn
cache-dependency-path: documentation/yarn.lock
- name: Install dependencies
run: |
cd documentation
yarn install --frozen-lockfile --non-interactive
- name: Build
run: |
cd documentation
yarn build
# 👆 Build steps
- name: Setup Pages
uses: actions/configure-pages@v3
- name: Upload artifact
uses: actions/upload-pages-artifact@v2
with:
# 👇 Specify build output path
path: documentation/build
- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@v2

48
.github/workflows/pre-commit.yml vendored Normal file
View File

@@ -0,0 +1,48 @@
name: pre-commit
on:
pull_request:
paths:
- src/**
- tests/**
- config/**
- pyproject.toml
- .pre-commit-config.yml
- .github/workflows/test.yml
push:
branches:
- master
paths:
- src/khoj/**
- tests/**
- config/**
- pyproject.toml
- .pre-commit-config.yml
- .github/workflows/test.yml
jobs:
test:
name: Setup Application and Lint
runs-on: ubuntu-latest
strategy:
fail-fast: false
steps:
- uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Set up Python 3.11
uses: actions/setup-python@v4
with:
python-version: 3.11
- name: ⏬️ Install Dependencies
run: |
sudo apt update && sudo apt install -y libegl1
python -m pip install --upgrade pip
- name: ⬇️ Install Application
run: pip install --no-cache-dir --upgrade .[dev]
- name: 🌡️ Validate Application
run: pre-commit run --hook-stage manual --all

View File

@@ -8,6 +8,7 @@ on:
- 'master'
paths:
- src/khoj/**
- src/interface/web/**
- pyproject.toml
- .github/workflows/pypi.yml
pull_request:
@@ -15,26 +16,41 @@ on:
- 'master'
paths:
- src/khoj/**
- src/interface/web/**
- pyproject.toml
- .github/workflows/pypi.yml
workflow_dispatch:
jobs:
publish:
name: Publish Python Package to PyPI
runs-on: ubuntu-20.04
runs-on: ubuntu-latest
permissions:
id-token: write
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up Python 3.10
- name: Set up Python 3.11
uses: actions/setup-python@v4
with:
python-version: '3.10'
python-version: '3.11'
- name: ⬇️ Install Application
- name: ⬇️ Install Server
run: python -m pip install --upgrade pip && pip install --upgrade .
- name: ⬇️ Install Web Client
run: |
yarn install
yarn pypiciexport
working-directory: src/interface/web
- name: 📂 Copy Generated Files
run: |
mkdir -p src/khoj/interface/compiled
cp -r /opt/hostedtoolcache/Python/3.11.9/x64/lib/python3.11/site-packages/khoj/interface/compiled/* src/khoj/interface/compiled/
- name: ⚙️ Build Python Package
run: |
# Setup Environment for Reproducible Builds
@@ -42,23 +58,23 @@ jobs:
export SOURCE_DATE_EPOCH=$(git log -1 --pretty=%ct)
rm -rf dist
# Build PyPi Package
# Build PyPI Package
pipx run build
- name: 🌡️ Validate Python Package
run: |
# Validate PyPi Package
pipx run check-wheel-contents dist/*.whl
pipx run check-wheel-contents dist/*.whl --ignore W004
pipx run twine check dist/*
- name: ⏫ Upload Python Package Artifacts
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v4
with:
name: khoj-assistant
path: dist/*.whl
name: khoj
path: dist/khoj-*.whl
- name: 📦 Publish Python Package to PyPI
if: startsWith(github.ref, 'refs/tags') || github.ref == 'refs/heads/master'
uses: pypa/gh-action-pypi-publish@v1.6.4
uses: pypa/gh-action-pypi-publish@v1.8.14
with:
password: ${{ secrets.PYPI_API_KEY }}
skip-existing: true

View File

@@ -2,8 +2,6 @@ name: test
on:
pull_request:
branches:
- 'master'
paths:
- src/khoj/**
- tests/**
@@ -13,7 +11,7 @@ on:
- .github/workflows/test.yml
push:
branches:
- 'master'
- master
paths:
- src/khoj/**
- tests/**
@@ -26,13 +24,25 @@ jobs:
test:
name: Run Tests
runs-on: ubuntu-latest
container: ubuntu:jammy
strategy:
fail-fast: false
matrix:
python_version:
- '3.9'
- '3.10'
- '3.11'
- '3.12'
services:
postgres:
image: ankane/pgvector
env:
POSTGRES_PASSWORD: postgres
POSTGRES_USER: postgres
ports:
- 5432:5432
options: --health-cmd pg_isready --health-interval 10s --health-timeout 5s --health-retries 5
steps:
- uses: actions/checkout@v3
with:
@@ -43,17 +53,37 @@ jobs:
with:
python-version: ${{ matrix.python_version }}
- name: ⏬️ Install Dependencies
- name: Install Git
run: |
sudo apt update && sudo apt install -y libegl1
apt update && apt install -y git
- name: ⏬️ Install Dependencies
env:
DEBIAN_FRONTEND: noninteractive
run: |
apt update && apt install -y libegl1 sqlite3 libsqlite3-dev libsqlite3-0 ffmpeg libsm6 libxext6
- name: ⬇️ Install Postgres
env:
DEBIAN_FRONTEND: noninteractive
run : |
apt install -y postgresql postgresql-client && apt install -y postgresql-server-dev-14
- name: ⬇️ Install pip
run: |
apt install -y python3-pip
python -m ensurepip --upgrade
python -m pip install --upgrade pip
- name: ⬇️ Install Application
run: pip install --upgrade .[dev]
- name: 🌡️ Validate Application
run: pre-commit run --hook-stage manual --all
run: sed -i 's/dynamic = \["version"\]/version = "0.0.0"/' pyproject.toml && pip install --upgrade .[dev]
- name: 🧪 Test Application
env:
POSTGRES_HOST: postgres
POSTGRES_PORT: 5432
POSTGRES_USER: postgres
POSTGRES_PASSWORD: postgres
POSTGRES_DB: postgres
run: pytest
timeout-minutes: 10

7
.gitignore vendored
View File

@@ -16,12 +16,15 @@ todesktop.json
# Build artifacts
/src/khoj/interface/web/images
/src/khoj/interface/built/
/src/khoj/interface/compiled/404.html
/build/
/dist/
khoj_assistant.egg-info
/config/khoj*.yml
.pytest_cache
khoj.log
*.log
/src/khoj/static
# Obsidian plugin artifacts
# ---
@@ -34,6 +37,8 @@ src/interface/obsidian/main.js
# Exclude sourcemaps
*.map
# IntelliJ
.idea
# obsidian
data.json

View File

@@ -15,6 +15,13 @@ repos:
- id: check-toml
- id: check-yaml
- repo: https://github.com/pycqa/isort
rev: 5.12.0
hooks:
- id: isort
name: isort (python)
args: ["--profile", "black", "--filter-files"]
- repo: https://github.com/pre-commit/mirrors-mypy
rev: v1.0.0
hooks:

View File

@@ -1,18 +1,44 @@
# syntax=docker/dockerfile:1
FROM ubuntu:jammy
LABEL org.opencontainers.image.source https://github.com/khoj-ai/khoj
LABEL homepage="https://khoj.dev"
LABEL repository="https://github.com/khoj-ai/khoj"
LABEL org.opencontainers.image.source="https://github.com/khoj-ai/khoj"
# Install System Dependencies
RUN apt update -y && apt -y install python3-pip git
RUN apt update -y && apt -y install python3-pip swig curl
# Install Node.js and Yarn
RUN curl -sL https://deb.nodesource.com/setup_20.x | bash -
RUN apt -y install nodejs
RUN curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | apt-key add -
RUN echo "deb https://dl.yarnpkg.com/debian/ stable main" | tee /etc/apt/sources.list.d/yarn.list
RUN apt update && apt -y install yarn
# Install RapidOCR dependencies
RUN apt -y install libgl1 libgl1-mesa-glx libglib2.0-0
# Install Application
COPY . .
RUN sed -i 's/dynamic = \["version"\]/version = "0.0.0"/' pyproject.toml && \
WORKDIR /app
COPY pyproject.toml .
COPY README.md .
ARG VERSION=0.0.0
RUN sed -i "s/dynamic = \\[\"version\"\\]/version = \"$VERSION\"/" pyproject.toml && \
pip install --no-cache-dir .
# Copy Source Code
COPY . .
# Set the PYTHONPATH environment variable in order for it to find the Django app.
ENV PYTHONPATH=/app/src:$PYTHONPATH
# Go to the directory src/interface/web and export the built Next.js assets
WORKDIR /app/src/interface/web
RUN bash -c "yarn cache clean && yarn install --verbose && yarn ciexport"
WORKDIR /app
# Run the Application
# There are more arguments required for the application to run,
# but these should be passed in through the docker-compose.yml file.
ARG PORT
EXPOSE ${PORT}
ENTRYPOINT ["khoj"]
ENTRYPOINT ["python3", "src/khoj/main.py"]

152
LICENSE
View File

@@ -1,23 +1,21 @@
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
GNU AFFERO GENERAL PUBLIC LICENSE
Version 3, 19 November 2007
Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>
Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for
software and other kinds of works.
The GNU Affero General Public License is a free, copyleft license for
software and other kinds of works, specifically designed to ensure
cooperation with the community in the case of network server software.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
our General Public Licenses are intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.
software for all its users.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
@@ -26,44 +24,34 @@ them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.
Developers that use our General Public Licenses protect your rights
with two steps: (1) assert copyright on the software, and (2) offer
you this License which gives you legal permission to copy, distribute
and/or modify the software.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.
A secondary benefit of defending all users' freedom is that
improvements made in alternate versions of the program, if they
receive widespread use, become available for other developers to
incorporate. Many developers of free software are heartened and
encouraged by the resulting cooperation. However, in the case of
software used on network servers, this result may fail to come about.
The GNU General Public License permits making a modified version and
letting the public access it on a server without ever releasing its
source code to the public.
Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.
The GNU Affero General Public License is designed specifically to
ensure that, in such cases, the modified source code becomes available
to the community. It requires the operator of a network server to
provide the source code of the modified version running there to the
users of that server. Therefore, public use of a modified version, on
a publicly accessible server, gives the public access to the source
code of the modified version.
For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.
Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.
An older license, called the Affero General Public License and
published by Affero, was designed to accomplish similar goals. This is
a different license, not a version of the Affero GPL, but Affero has
released a new version of the Affero GPL which permits relicensing under
this license.
The precise terms and conditions for copying, distribution and
modification follow.
@@ -72,7 +60,7 @@ modification follow.
0. Definitions.
"This License" refers to version 3 of the GNU General Public License.
"This License" refers to version 3 of the GNU Affero General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
@@ -549,35 +537,45 @@ to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Use with the GNU Affero General Public License.
13. Remote Network Interaction; Use with the GNU General Public License.
Notwithstanding any other provision of this License, if you modify the
Program, your modified version must prominently offer all users
interacting with it remotely through a computer network (if your version
supports such interaction) an opportunity to receive the Corresponding
Source of your version by providing access to the Corresponding Source
from a network server at no charge, through some standard or customary
means of facilitating copying of software. This Corresponding Source
shall include the Corresponding Source for any work covered by version 3
of the GNU General Public License that is incorporated pursuant to the
following paragraph.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
under version 3 of the GNU General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.
but the work with which it is combined will remain governed by version
3 of the GNU General Public License.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
the GNU Affero General Public License from time to time. Such new versions
will be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU General
Program specifies that a certain numbered version of the GNU Affero General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
GNU Affero General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
versions of the GNU Affero General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
@@ -619,3 +617,45 @@ Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Affero General Public License as published
by the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If your software can interact with users remotely through a computer
network, you should also make sure that it provides a way for users to
get its source. For example, if your program is a web application, its
interface could display a "Source" link that leads users to an archive
of the code. There are many ways you could offer source, and different
solutions will be better for different programs; see section 13 for the
specific requirements.
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU AGPL, see
<https://www.gnu.org/licenses/>.

View File

@@ -4,40 +4,75 @@
[![test](https://github.com/khoj-ai/khoj/actions/workflows/test.yml/badge.svg)](https://github.com/khoj-ai/khoj/actions/workflows/test.yml)
[![dockerize](https://github.com/khoj-ai/khoj/actions/workflows/dockerize.yml/badge.svg)](https://github.com/khoj-ai/khoj/pkgs/container/khoj)
[![pypi](https://github.com/khoj-ai/khoj/actions/workflows/pypi.yml/badge.svg)](https://pypi.org/project/khoj-assistant/)
[![pypi](https://github.com/khoj-ai/khoj/actions/workflows/pypi.yml/badge.svg)](https://pypi.org/project/khoj/)
![Discord](https://img.shields.io/discord/1112065956647284756?style=plastic&label=discord)
</div>
<div align="center">
<b>An AI personal assistant for your digital brain</b>
<b>The open-source, personal AI for your digital brain</b>
</div>
<br />
<div align="center">
[📜 Read Docs](https://docs.khoj.dev)
[🤖 Read Docs](https://docs.khoj.dev)
<span>&nbsp;&nbsp;&nbsp;&nbsp;</span>
[🌍 Try Khoj Cloud](https://khoj.dev)
[🏮 Khoj Cloud](https://khoj.dev)
<span>&nbsp;&nbsp;&nbsp;&nbsp;</span>
[💬 Get Involved](https://discord.gg/BDgyabRM6e)
<span>&nbsp;&nbsp;&nbsp;&nbsp;</span>
[📚 Read Blog](https://blog.khoj.dev)
</div>
<div align="center">
<div align="left">
***
Khoj is a desktop application to search and chat with your notes, documents and images.<br />
It is an offline-first, open source AI personal assistant accessible from your Emacs, Obsidian or Web browser.<br />
It works with jpeg, markdown, notion, org-mode, pdf files and github repositories.<br />
Khoj is an application that creates always-available, personal AI agents for you to extend your capabilities.
- You can share your notes and documents to extend your digital brain.
- Your AI agents have access to the internet, allowing you to incorporate realtime information.
- Khoj is accessible on Desktop, Emacs, Obsidian, Web and Whatsapp.
- You can share pdf, markdown, org-mode, notion files and github repositories.
- You'll get fast, accurate semantic search on top of your docs.
- Your agents can create deeply personal images and understand your speech.
- Khoj is open-source, self-hostable. Always.
***
</div>
| 🔎 Search | 💬 Chat |
|:---------:|:-------:|
| Quickly retrieve relevant documents using natural language | Get answers and create content from your existing knowledge base |
| Does not need internet | Can be configured to work without internet |
| <img src="https://docs.khoj.dev/assets/khoj_search_on_web.png" width="400px"> | <img src="https://docs.khoj.dev/assets/khoj_chat_on_web.png" width="400px"> |
## See it in action
<img src="https://github.com/khoj-ai/khoj/blob/master/documentation/assets/img/using_khoj_for_studying.gif?raw=true" alt="Khoj Demo">
Go to https://app.khoj.dev to see Khoj live.
## Full feature list
You can see the full feature list [here](https://docs.khoj.dev/category/features).
## Self-Host
To get started with self-hosting Khoj, [read the docs](https://docs.khoj.dev/get-started/setup).
## Contributors
Cheers to our awesome contributors! 🎉
<a href="https://github.com/khoj-ai/khoj/graphs/contributors">
<img src="https://contrib.rocks/image?repo=khoj-ai/khoj" />
</a>
Made with [contrib.rocks](https://contrib.rocks).
### Interested in Contributing?
We are always looking for contributors to help us build new features, improve the project documentation, or fix bugs. If you're interested, please see our [Contributing Guidelines](https://docs.khoj.dev/contributing/development) and check out our [Contributors Project Board](https://github.com/orgs/khoj-ai/projects/4).
## [Sponsors](https://github.com/sponsors/khoj-ai)
Shout out to our brilliant sponsors! 🌈
<a href="http://github.com/beekeeb">
<img src="https://raw.githubusercontent.com/beekeeb/piantor/main/docs/beekeeb.png" width=250/>
</a>

View File

@@ -1,51 +0,0 @@
content-type:
# The /data/folder/ prefix to the folders is here because this is
# the directory to which the local files are copied in the docker-compose.
# If changing, the docker-compose volumes should also be changed to match.
org:
input-files: null
input-filter: ["/data/org/**/*.org"]
compressed-jsonl: "/data/embeddings/notes.jsonl.gz"
embeddings-file: "/data/embeddings/note_embeddings.pt"
index_heading_entries: false
markdown:
input-files: null
input-filter: ["/data/markdown/**/*.markdown"]
compressed-jsonl: "/data/embeddings/markdown.jsonl.gz"
embeddings-file: "/data/embeddings/markdown_embeddings.pt"
pdf:
input-files: null
input-filter: ["/data/pdf/**/*.pdf"]
compressed-jsonl: "/data/embeddings/pdf.jsonl.gz"
embeddings-file: "/data/embeddings/pdf_embeddings.pt"
image:
input-directories: ["/data/images/"]
embeddings-file: "/data/embeddings/image_embeddings.pt"
batch-size: 50
use-xmp-metadata: false
notion: null
github: null
plugins: null
search-type:
symmetric: null
asymmetric:
encoder: "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"
cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"
model_directory: "/data/models/asymmetric"
image:
encoder: "sentence-transformers/clip-ViT-B-32"
model_directory: "/data/models/image_encoder"
processor:
conversation:
conversation-logfile: "/data/embeddings/conversation_logs.json"
enable-offline-chat: false
openai: null
app:
should_log_telemetry: true

View File

@@ -1,57 +0,0 @@
content-type:
org:
input-files: # ["/path/to/org-file.org"] REQUIRED IF input-filter IS NOT SET OR
input-filter: # ["/path/to/org/*.org"] REQUIRED IF input-files IS NOT SET
compressed-jsonl: "~/.khoj/content/org/org.jsonl.gz"
embeddings-file: "~/.khoj/content/org/org_embeddings.pt"
index_heading_entries: false # Set to true to index entries with empty body
markdown:
input-files: # ["/path/to/markdown-file.md"] REQUIRED IF input-filter IS NOT SET OR
input-filter: # ["/path/to/markdown/*.md"] REQUIRED IF input-files IS NOT SET
compressed-jsonl: "~/.khoj/content/markdown/markdown.jsonl.gz"
embeddings-file: "~/.khoj/content/markdown/markdown_embeddings.pt"
ledger:
input-files: # ["/path/to/ledger-file.beancount"] REQUIRED IF input-filter is not set OR
input-filter: # ["/path/to/ledger/*.beancount"] REQUIRED IF input-files is not set
compressed-jsonl: "~/.khoj/content/ledger/ledger.jsonl.gz"
embeddings-file: "~/.khoj/content/ledger/ledger_embeddings.pt"
image:
input-directories: # ["/path/to/images/"] REQUIRED IF input-filter IS NOT SET OR
input-filter: # ["/path/to/images/*.jpg"] REQUIRED IF input-directories IS NOT SET
embeddings-file: "~/.khoj/content/image/image_embeddings.pt"
batch-size: 50
use-xmp-metadata: false
music:
input-files: # ["/path/to/music-file.org"] REQUIRED IF input-filter IS NOT SET OR
input-filter: # ["/path/to/music/*.org"] REQUIRED IF input-files IS NOT SET
compressed-jsonl: "~/.khoj/content/music/music.jsonl.gz"
embeddings-file: "~/.khoj/content/music/music_embeddings.pt"
search-type:
symmetric:
encoder: "sentence-transformers/all-MiniLM-L6-v2"
cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"
encoder-type: sentence_transformers.SentenceTransformer
model_directory: "~/.khoj/search/symmetric/"
asymmetric:
encoder: "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"
cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"
encoder-type: sentence_transformers.SentenceTransformer
model_directory: "~/.khoj/search/asymmetric/"
image:
encoder: "sentence-transformers/clip-ViT-B-32"
encoder-type: sentence_transformers.SentenceTransformer
model_directory: "~/.khoj/search/image/"
processor:
conversation:
openai-api-key: # "YOUR_OPENAI_API_KEY"
model: "text-davinci-003"
chat-model: "gpt-3.5-turbo"
conversation-logfile: "~/.khoj/processor/conversation/conversation_logs.json"

View File

@@ -1,7 +1,28 @@
version: "3.9"
services:
database:
image: ankane/pgvector
ports:
- "5432:5432"
environment:
POSTGRES_USER: postgres
POSTGRES_PASSWORD: postgres
POSTGRES_DB: postgres
volumes:
- khoj_db:/var/lib/postgresql/data/
healthcheck:
test: ["CMD-SHELL", "pg_isready -U postgres"]
interval: 30s
timeout: 10s
retries: 5
server:
depends_on:
database:
condition: service_healthy
# Use the following line to use the latest version of khoj. Otherwise, it will build from source.
image: ghcr.io/khoj-ai/khoj:latest
# Uncomment the following line to build from source. This will take a few minutes. Comment the next two lines out if you want to use the offiicial image.
# build:
# context: .
ports:
# If changing the local port (left hand side), no other changes required.
# If changing the remote port (right hand side),
@@ -10,24 +31,27 @@ services:
- "42110:42110"
working_dir: /app
volumes:
- .:/app
# These mounted volumes hold the raw data that should be indexed for search.
# The path in your local directory (left hand side)
# points to the files you want to index.
# The path of the mounted directory (right hand side),
# must match the path prefix in your config file.
- ./tests/data/org/:/data/org/
- ./tests/data/images/:/data/images/
- ./tests/data/markdown/:/data/markdown/
- ./tests/data/pdf/:/data/pdf/
# Embeddings and models are populated after the first run
# You can set these volumes to point to empty directories on host
- ./tests/data/embeddings/:/root/.khoj/content/
- ./tests/data/models/:/root/.khoj/search/
- khoj_config:/root/.khoj/
- khoj_models:/root/.cache/torch/sentence_transformers
# Use 0.0.0.0 to explicitly set the host ip for the service on the container. https://pythonspeed.com/articles/docker-connection-refused/
command: --host="0.0.0.0" --port=42110 -vv
environment:
- POSTGRES_DB=postgres
- POSTGRES_USER=postgres
- POSTGRES_PASSWORD=postgres
- POSTGRES_HOST=database
- POSTGRES_PORT=5432
- KHOJ_DJANGO_SECRET_KEY=secret
- KHOJ_DEBUG=False
- KHOJ_ADMIN_EMAIL=username@example.com
- KHOJ_ADMIN_PASSWORD=password
# Uncomment the following lines to make your instance publicly accessible.
# Replace the domain with your domain. Proceed with caution, especially if you are using anonymous mode.
# - KHOJ_NO_HTTPS=True
# - KHOJ_DOMAIN=192.168.0.104
command: --host="0.0.0.0" --port=42110 -vv --anonymous-mode
volumes:
khoj_config:
khoj_db:
khoj_models:

View File

@@ -1 +0,0 @@
docs.khoj.dev

View File

@@ -1,54 +0,0 @@
<p align="center"><img src="./assets/khoj-logo-sideways-500.png" width="200" alt="Khoj Logo"></p>
<div align="center">
[![test](https://github.com/khoj-ai/khoj/actions/workflows/test.yml/badge.svg)](https://github.com/khoj-ai/khoj/actions/workflows/test.yml)
[![dockerize](https://github.com/khoj-ai/khoj/actions/workflows/dockerize.yml/badge.svg)](https://github.com/khoj-ai/khoj/pkgs/container/khoj)
[![pypi](https://github.com/khoj-ai/khoj/actions/workflows/pypi.yml/badge.svg)](https://pypi.org/project/khoj-assistant/)
</div>
<div align="center">
<b>An AI personal assistant for your digital brain</b>
</div>
<div align="center">
[📜 Explore Code](https://github.com/khoj-ai/khoj)
<span>&nbsp;&nbsp;&nbsp;&nbsp;</span>
[🌍 Try Khoj Cloud](https://khoj.dev)
<span>&nbsp;&nbsp;&nbsp;&nbsp;</span>
[💬 Get Involved](https://discord.gg/BDgyabRM6e)
</div>
## Introduction
Welcome to the Khoj Docs! This is the best place to [get started](./setup.md) with Khoj.
- Khoj is a desktop application to [search](./search.md) and [chat](./chat.md) with your notes, documents and images
- It is an offline-first, open source AI personal assistant accessible from your [Emacs](./emacs.md), [Obsidian](./obsidian.md) or [Web browser](./web.md)
- It works with jpeg, markdown, [notion](./notion_integration.md) org-mode, pdf files and [github repositories](./github_integration.md)
- If you have more questions, check out the [FAQ](https://faq.khoj.dev/) - it's a live Khoj instance indexing our Github repository!
## Quickstart
[Click here](./setup.md) for full setup instructions
```shell
pip install khoj-assistant && khoj
```
## Overview
<img src="https://docs.khoj.dev/assets/khoj_search_on_web.png" width="400px">
<span>&nbsp;&nbsp;</span>
<img src="https://docs.khoj.dev/assets/khoj_chat_on_web.png" width="400px">
#### [Search](./search.md)
- **Local**: Your personal data stays local. All search and indexing is done on your machine.
- **Incremental**: Incremental search for a fast, search-as-you-type experience
#### [Chat](./chat.md)
- **Faster answers**: Find answers faster, smoother than search. No need to manually scan through your notes to find answers.
- **Iterative discovery**: Iteratively explore and (re-)discover your notes
- **Assisted creativity**: Smoothly weave across answers retrieval and content generation
- **Online or Offline**: Choose online or offline chat depending on your requirements

View File

@@ -1,14 +0,0 @@
<!-- _coverpage.md -->
![logo](./assets/khoj-logo-sideways-200.png)
> An open source, AI personal assistant for your notes
- Lightning fast search
- Multi-turn chat
- Keeps you in control of your data
[GitHub](https://github.com/khoj-ai/khoj)
[Get Started](#khoj)
![color](#f9f5de)

View File

@@ -1,22 +0,0 @@
- Get Started
- [Overview](README.md)
- [Install](setup.md)
- [Demos](demos.md)
- Use
- [Features](features.md)
- [Chat](chat.md)
- [Search](search.md)
- Interfaces
- [Obsidian](obsidian.md)
- [Emacs](emacs.md)
- [Web](web.md)
- Online Data Sources
- [Github](github_integration.md)
- [Notion](notion_integration.md)
- Miscellaneous
- [Telemetry](telemetry.md)
- [Advanced](advanced.md)
- [Performance](performance.md)
- [Credits](credits.md)
- Contributing
- [Development](development.md)

View File

@@ -1,81 +0,0 @@
## Advanced Usage
### Search across Different Languages
To search for notes in multiple, different languages, you can use a [multi-lingual model](https://www.sbert.net/docs/pretrained_models.html#multi-lingual-models).<br />
For example, the [paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) supports [50+ languages](https://www.sbert.net/docs/pretrained_models.html#:~:text=we%20used%20the%20following%2050%2B%20languages), has good search quality and speed. To use it:
1. Manually update `search-type > asymmetric > encoder` to `paraphrase-multilingual-MiniLM-L12-v2` in your `~/.khoj/khoj.yml` file for now. See diff of `khoj.yml` below for illustration:
```diff
asymmetric:
- encoder: sentence-transformers/multi-qa-MiniLM-L6-cos-v1
+ encoder: paraphrase-multilingual-MiniLM-L12-v2
cross-encoder: cross-encoder/ms-marco-MiniLM-L-6-v2
model_directory: "~/.khoj/search/asymmetric/"
```
2. Regenerate your content index. For example, by opening [\<khoj-url\>/api/update?t=force](http://localhost:42110/api/update?t=force)
### Access Khoj on Mobile
1. [Setup Khoj](/#/setup) on your personal server. This can be any always-on machine, i.e an old computer, RaspberryPi(?) etc
2. [Install](https://tailscale.com/kb/installation/) [Tailscale](tailscale.com/) on your personal server and phone
3. Open the Khoj web interface of the server from your phone browser.<br /> It should be `http://tailscale-ip-of-server:42110` or `http://name-of-server:42110` if you've setup [MagicDNS](https://tailscale.com/kb/1081/magicdns/)
4. Click the [Add to Homescreen](https://developer.mozilla.org/en-US/docs/Web/Progressive_web_apps/Add_to_home_screen) button
5. Enjoy exploring your notes, documents and images from your phone!
![](./assets/khoj_pwa_android.png?)
### Use OpenAI Models for Search
#### Setup
1. Set `encoder-type`, `encoder` and `model-directory` under `asymmetric` and/or `symmetric` `search-type` in your `khoj.yml` (at `~/.khoj/khoj.yml`):
```diff
asymmetric:
- encoder: "sentence-transformers/multi-qa-MiniLM-L6-cos-v1"
+ encoder: text-embedding-ada-002
+ encoder-type: khoj.utils.models.OpenAI
cross-encoder: "cross-encoder/ms-marco-MiniLM-L-6-v2"
- encoder-type: sentence_transformers.SentenceTransformer
- model_directory: "~/.khoj/search/asymmetric/"
+ model-directory: null
```
2. [Setup your OpenAI API key in Khoj](/#/chat?id=setup)
3. Restart Khoj server to generate embeddings. It will take longer than with the offline search models.
#### Warnings
This configuration *uses an online model*
- It will **send all notes to OpenAI** to generate embeddings
- **All queries will be sent to OpenAI** when you search with Khoj
- You will be **charged by OpenAI** based on the total tokens processed
- It *requires an active internet connection* to search and index
### Bootstrap Khoj Search for Offline Usage later
You can bootstrap Khoj pre-emptively to run on machines that do not have internet access. An example use-case would be to run Khoj on an air-gapped machine.
Note: *Only search can currently run in fully offline mode, not chat.*
- With Internet
1. Manually download the [asymmetric text](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1), [symmetric text](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) and [image search](https://huggingface.co/sentence-transformers/clip-ViT-B-32) models from HuggingFace
2. Pip install khoj (and dependencies) in an associated virtualenv. E.g `python -m venv .venv && source .venv/bin/activate && pip install khoj-assistant`
- Without Internet
1. Copy each of the search models into their respective folders, `asymmetric`, `symmetric` and `image` under the `~/.khoj/search/` directory on the air-gapped machine
2. Copy the khoj virtual environment directory onto the air-gapped machine, activate the environment and start and khoj as normal. E.g `source .venv/bin/activate && khoj`
### Query Filters
Use structured query syntax to filter entries from your knowledge based used by search results or chat responses.
- **Word Filter**: Get entries that include/exclude a specified term
- Entries that contain term_to_include: `+"term_to_include"`
- Entries that contain term_to_exclude: `-"term_to_exclude"`
- **Date Filter**: Get entries containing dates in YYYY-MM-DD format from specified date (range)
- Entries from April 1st 1984: `dt:"1984-04-01"`
- Entries after March 31st 1984: `dt>="1984-04-01"`
- Entries before April 2nd 1984 : `dt<="1984-04-01"`
- **File Filter**: Get entries from a specified file
- Entries from incoming.org file: `file:"incoming.org"`
- Combined Example
- `what is the meaning of life? file:"1984.org" dt>="1984-01-01" dt<="1985-01-01" -"big" -"brother"`
- Adds all filters to the natural language query. It should return entries
- from the file *1984.org*
- containing dates from the year *1984*
- excluding words *"big"* and *"brother"*
- that best match the natural language query *"what is the meaning of life?"*

Binary file not shown.

Before

Width:  |  Height:  |  Size: 200 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 358 KiB

View File

@@ -1,47 +0,0 @@
### Khoj Chat
#### Overview
- Creates a personal assistant for you to inquire and engage with your notes
- You can choose to use Online or Offline Chat depending on your requirements
- Supports multi-turn conversations with the relevant notes for context
- Shows reference notes used to generate a response
### Setup
#### Offline Chat
Offline chat works without internet but it is slower, lower quality and more compute intensive.
!> **Warning**: This will download a 3Gb+ Llama v2 chat model which can take some time
- Open your [Khoj settings](http://localhost:42110/config/), click *Enable* on the Offline Chat card
![Configure offline chat](https://user-images.githubusercontent.com/6413477/257021364-8a2029f5-dc21-4de8-9af9-9ba6100d695c.mp4 ':include :type=mp4')
#### Online Chat
Online chat requires internet to use ChatGPT but is faster, higher quality and less compute intensive.
!> **Warning**: This will enable Khoj to send your chat queries and notes to OpenAI for processing
1. Get your [OpenAI API Key](https://platform.openai.com/account/api-keys)
2. Open your [Khoj Online Chat settings](http://localhost:42110/config/processor/conversation), add your OpenAI API key, and click *Save*. Then go to your [Khoj settings](http://localhost:42110/config) and click `Configure`. This will refresh Khoj with your OpenAI API key.
![Configure online chat](https://user-images.githubusercontent.com/6413477/256998908-ac26e55e-13a2-45fb-9348-3b90a62f7687.mp4 ':include :type=mp4')
### Use
1. Open Khoj Chat
- **On Web**: Open [/chat](http://localhost:42110/chat) in your web browser
- **On Obsidian**: Search for *Khoj: Chat* in the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
- **On Emacs**: Run `M-x khoj <user-query>`
2. Enter your queries to chat with Khoj. Use [slash commands](#commands) and [query filters](./advanced.md#query-filters) to change what Khoj uses to respond
![](./assets/khoj_chat_on_web.png ':size=400px')
#### Details
1. Your query is used to retrieve the most relevant notes, if any, using Khoj search
2. These notes, the last few messages and associated metadata is passed to the enabled chat model along with your query to generate a response
#### Commands
Slash commands allows you to change what Khoj uses to respond to your query
- **/notes**: Limit chat to only respond using your notes, not just Khoj's general world knowledge as reference
- **/general**: Limit chat to only respond using Khoj's general world knowledge, not using your notes as reference
- **/default**: Allow chat to respond using your notes or it's general knowledge as reference. It's the default behavior when no slash command is used
- **/help**: Use /help to get all available commands and general information about Khoj

View File

@@ -1,8 +0,0 @@
## Credits
- [Multi-QA MiniLM Model](https://huggingface.co/sentence-transformers/multi-qa-MiniLM-L6-cos-v1), [All MiniLM Model](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2) for Text Search. See [SBert Documentation](https://www.sbert.net/examples/applications/retrieve_rerank/README.html)
- [OpenAI CLIP Model](https://github.com/openai/CLIP) for Image Search. See [SBert Documentation](https://www.sbert.net/examples/applications/image-search/README.html)
- Charles Cave for [OrgNode Parser](http://members.optusnet.com.au/~charles57/GTD/orgnode.html)
- [Org.js](https://mooz.github.io/org-js/) to render Org-mode results on the Web interface
- [Markdown-it](https://github.com/markdown-it/markdown-it) to render Markdown results on the Web interface
- [GPT4All](https://github.com/nomic-ai/gpt4all) to chat with local LLM

View File

@@ -1,51 +0,0 @@
## Demos
### Screenshots
#### Web
![](./assets/khoj_search_on_web.png ':size=300px')
![](./assets/khoj_chat_on_web.png ':size=300px')
#### Obsidian
![](./assets/khoj_search_on_obsidian.png ':size=300px')
![](./assets/khoj_chat_on_obsidian.png ':size=300px')
#### Emacs
![](./assets/khoj_search_on_emacs.png ':size=300px')
![](./assets/khoj_chat_on_emacs.png ':size=400px')
### Videos
#### Khoj in Obsidian
[KhojObsidian](https://github-production-user-asset-6210df.s3.amazonaws.com/6413477/240061700-3e33d8ea-25bb-46c8-a3bf-c92f78d0f56b.mp4 ':include :type=mp4')
##### Installation
1. Install Khoj via `pip` and start Khoj backend in a terminal (Run `khoj`)
```bash
python -m pip install khoj-assistant
khoj
```
2. Install Khoj plugin via Community Plugins settings pane on Obsidian app
- Check the new Khoj plugin settings
- Let Khoj backend index the markdown, pdf, Github markdown files in the current Vault
- Open Khoj plugin on Obsidian via Search button on Left Pane
- Search \"*Announce plugin to folks*\" in the [Obsidian Plugin docs](https://marcus.se.net/obsidian-plugin-docs/)
- Jump to the [search result](https://marcus.se.net/obsidian-plugin-docs/publishing/submit-your-plugin)
#### Khoj in Emacs, Browser
[KhojEmacs](https://user-images.githubusercontent.com/6413477/184735169-92c78bf1-d827-4663-9087-a1ea194b8f4b.mp4 ':include :type=mp4')
##### Installation
- Install Khoj via pip
- Start Khoj app
- Add this readme and [khoj.el readme](https://github.com/khoj-ai/khoj/tree/master/src/interface/emacs) as org-mode for Khoj to index
- Search \"*Setup editor*\" on the Web and Emacs. Re-rank the results for better accuracy
- Top result is what we are looking for, the [section to Install Khoj.el on Emacs](https://github.com/khoj-ai/khoj/tree/master/src/interface/emacs#2-Install-Khojel)
##### Analysis
- The results do not have any words used in the query
- *Based on the top result it seems the re-ranking model understands that Emacs is an editor?*
- The results incrementally update as the query is entered
- The results are re-ranked, for better accuracy, once user hits enter

View File

@@ -1,32 +0,0 @@
# Installing the Desktop Application [Deprecated -- for 0.11.4 and below]
We have beta desktop images available for download with new releases. This is recommended if you don't want to bother with the command line. Download the latest release from [here](https://github.com/khoj-ai/khoj/releases). You can find the latest release under the `Assets` section.
## MacOS
1. Download the latest release from [here](https://github.com/khoj-ai/khoj/releases).
- If your Mac uses one of the Silicon chips, then download the `Khoj_<version>_arm64.dmg` file. Otherwise, download the `Khoj_<version>_amd64.dmg` file.
2. Open the downloaded file and drag the Khoj app to your Applications folder.
## Windows
Make sure you meet the prerequisites for Windows installation. You can find them [here](windows_install.md#prerequisites).
1. Download the latest release from [here](https://github.com/khoj-ai/khoj/releases). You'll want the `khoj_<version>_amd64.exe` file.
2. Open the downloaded file and double click to install.
## Linux
For the Linux installation, you have to have `glibc` version 2.35 or higher. You can check your version with `ldd --version`.
1. Download the latest release from [here](https://github.com/khoj-ai/khoj/releases). You'll want the `khoj_<version>_amd64.deb` file.
2. In your downloads folder, run `sudo dpkg -i khoj_<version>_amd64.deb` to install Khoj.
# Uninstall
If you decide you want to uninstall the application, you can uninstall it like any other application on your system. For example, on MacOS, you can drag the application to the trash. On Windows, you can uninstall it from the `Add or Remove Programs` menu. On Linux, you can uninstall it with `sudo apt remove khoj`.
In addition to that, you might want to `rm -rf` the following directories:
- `~/.khoj`
- `~/.cache/gpt4all`

View File

@@ -1,133 +0,0 @@
# Development
## Setup
### Using Pip
#### 1. Install
```shell
# Get Khoj Code
git clone https://github.com/khoj-ai/khoj && cd khoj
# Create, Activate Virtual Environment
python3 -m venv .venv && source .venv/bin/activate
# Install Khoj for Development
pip install -e .[dev]
# For MacOS or zsh users run this
pip install -e .'[dev]'
```
#### 2. Run
1. Start Khoj
```shell
khoj -vv
```
2. Configure Khoj
- **Via the Settings UI**: Add files, directories to index the [Khoj settings](http://localhost:42110/config) UI once Khoj has started up. Once you've saved all your settings, click `Configure`.
- **Manually**:
- Copy the `config/khoj_sample.yml` to `~/.khoj/khoj.yml`
- Set `input-files` or `input-filter` in each relevant `content-type` section of `~/.khoj/khoj.yml`
- Set `input-directories` field in `image` `content-type` section
- Delete `content-type` and `processor` sub-section(s) irrelevant for your use-case
- Restart khoj
Note: Wait after configuration for khoj to Load ML model, generate embeddings and expose API to query notes, images, documents etc specified in config YAML
### Using Docker
#### 1. Clone
```shell
git clone https://github.com/khoj-ai/khoj && cd khoj
```
#### 2. Configure
- **Required**: Update [docker-compose.yml](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml) to mount your images, (org-mode or markdown) notes, PDFs and Github repositories
- **Optional**: Edit application configuration in [khoj_docker.yml](https://github.com/khoj-ai/khoj/blob/master/config/khoj_docker.yml)
#### 3. Run
```shell
docker-compose up -d
```
*Note: The first run will take time. Let it run, it\'s mostly not hung, just generating embeddings*
#### 4. Upgrade
```shell
docker-compose build --pull
```
## Validate
### Before Making Changes
1. Install Git Hooks for Validation
```shell
pre-commit install -t pre-push -t pre-commit
```
- This ensures standard code formatting fixes and other checks run automatically on every commit and push
- Note 1: If [pre-commit](https://pre-commit.com/#intro) didn't already get installed, [install it](https://pre-commit.com/#install) via `pip install pre-commit`
- Note 2: To run the pre-commit changes manually, use `pre-commit run --hook-stage manual --all` before creating PR
### Before Creating PR
1. Run Tests. If you get an error complaining about a missing `fast_tokenizer_file`, follow the solution [in this Github issue](https://github.com/UKPLab/sentence-transformers/issues/1659).
```shell
pytest
```
2. Run MyPy to check types
```shell
mypy --config-file pyproject.toml
```
### After Creating PR
- Automated [validation workflows](.github/workflows) run for every PR.
Ensure any issues seen by them our fixed
- Test the python packge created for a PR
1. Download and extract the zipped `.whl` artifact generated from the pypi workflow run for the PR.
2. Install (in your virtualenv) with `pip install /path/to/download*.whl>`
3. Start and use the application to see if it works fine
## Create Khoj Release
Follow the steps below to [release](https://github.com/debanjum/khoj/releases/) Khoj. This will create a stable release of Khoj on [Pypi](https://pypi.org/project/khoj-assistant/), [Melpa](https://stable.melpa.org/#%252Fkhoj) and [Obsidian](https://obsidian.md/plugins?id%253Dkhoj). It will also create desktop apps of Khoj and attach them to the latest release.
1. Create and tag release commit by running the bump_version script. The release commit sets version number in required metadata files.
```shell
./scripts/bump_version.sh -c "<release_version>"
```
2. Push commit and then the tag to trigger the release workflow to create Release with auto generated release notes.
```shell
git push origin master # push release commit to khoj repository
git push origin <release_version> # push release tag to khoj repository
```
3. [Optional] Update the Release Notes to highlight new features, fixes and updates
## Architecture
![](./assets/khoj_architecture.png)
## Visualize Codebase
*[Interactive Visualization](https://mango-dune-07a8b7110.1.azurestaticapps.net/?repo=debanjum%2Fkhoj)*
![](./assets/khoj_codebase_visualization_0.2.1.png)
## Visualize Khoj Obsidian Plugin Codebase
![](./assets/khoj_obsidian_codebase_visualization_0.2.1.png)
## Khoj Obsidian Plugin Implementation
The plugin implements the following functionality to search your notes with Khoj:
- [X] Open the Khoj search modal via left ribbon icon or the *Khoj: Search* command
- [X] Render results as Markdown preview to improve readability
- [X] Configure Khoj via the plugin setting tab on the settings page
- Set Obsidian Vault to Index with Khoj. Defaults to all markdown, PDF files in current Vault
- Set URL of Khoj backend
- Set Number of Search Results to show in Search Modal
- [X] Allow reranking of result to improve search quality
- [X] Allow Finding notes similar to current note being viewed

View File

@@ -1,151 +0,0 @@
<h1><img src="./assets/khoj-logo-sideways-500.png" width="200" alt="Khoj Logo">Emacs</h1>
> An AI personal assistance for your digital brain
<img src="https://stable.melpa.org/packages/khoj-badge.svg" width="150" alt="Melpa Stable Badge">
<img src="https://melpa.org/packages/khoj-badge.svg" width="150" alt="Melpa Badge">
<img src="https://github.com/khoj-ai/khoj/actions/workflows/build_khoj_el.yml/badge.svg" width="150" alt="Build Badge">
<img src="https://github.com/khoj-ai/khoj/actions/workflows/test_khoj_el.yml/badge.svg" width="150" alt="Test Badge">
## Features
- **Search**
- **Natural**: Advanced natural language understanding using Transformer based ML Models
- **Local**: Your personal data stays local. All search, indexing is done on your machine*
- **Incremental**: Incremental search for a fast, search-as-you-type experience
- **Chat**
- **Faster answers**: Find answers faster than search
- **Iterative discovery**: Iteratively explore and (re-)discover your notes
- **Assisted creativity**: Smoothly weave across answer retrieval and content generation
## Interface
#### Search
![khoj search on emacs](./assets/khoj_search_on_emacs.png ':size=400px')
#### Chat
![khoj chat on emacs](./assets/khoj_chat_on_emacs.png ':size=400px')
## Setup
- *Make sure [python](https://realpython.com/installing-python/) and [pip](https://pip.pypa.io/en/stable/installation/) are installed on your machine*
- *khoj.el attempts to automatically install, start and configure the khoj server.*
If this fails, follow [these instructions](/setup) to manually setup the khoj server.
### Direct Install
```elisp
M-x package-install khoj
```
### Minimal Install
Add below snippet to your Emacs config file.
Indexes your org-agenda files, by default.
```elisp
;; Install Khoj Package from MELPA Stable
(use-package khoj
:ensure t
:pin melpa-stable
:bind ("C-c s" . 'khoj)
```
- Note: Install `khoj.el` from MELPA (instead of MELPA Stable) if you installed the pre-release version of khoj
- That is, use `:pin melpa` to install khoj.el in above snippet if khoj server was installed with `--pre` flag, i.e `pip install --pre khoj-assistant`
- Else use `:pin melpa-stable` to install khoj.el in above snippet if khoj was installed with `pip install khoj-assistant`
- This ensures both khoj.el and khoj app are from the same version (git tagged or latest)
### Standard Install
Add below snippet to your Emacs config file.
Indexes the specified org files, directories. Sets up OpenAI API key for Khoj Chat
```elisp
;; Install Khoj Package from MELPA Stable
(use-package khoj
:ensure t
:pin melpa-stable
:bind ("C-c s" . 'khoj)
:config (setq khoj-org-directories '("~/docs/org-roam" "~/docs/notes")
khoj-org-files '("~/docs/todo.org" "~/docs/work.org")
khoj-openai-api-key "YOUR_OPENAI_API_KEY")) ; required to enable chat
```
### With [Straight.el](https://github.com/raxod502/straight.el)
Add below snippet to your Emacs config file.
Indexes the specified org files, directories. Sets up OpenAI API key for Khoj Chat
```elisp
;; Install Khoj Package using Straight.el
(use-package khoj
:after org
:straight (khoj :type git :host github :repo "khoj-ai/khoj" :files (:defaults "src/interface/emacs/khoj.el"))
:bind ("C-c s" . 'khoj)
:config (setq khoj-org-directories '("~/docs/org-roam" "~/docs/notes")
khoj-org-files '("~/docs/todo.org" "~/docs/work.org")
khoj-openai-api-key "YOUR_OPENAI_API_KEY" ; required to enable chat)
```
## Use
### Search
1. Hit `C-c s s` (or `M-x khoj RET s`) to open khoj search
2. Enter your query in natural language
e.g "What is the meaning of life?", "My life goals for 2023"
### Chat
1. Hit `C-c s c` (or `M-x khoj RET c`) to open khoj chat
2. Ask questions in a natural, conversational style
E.g "When did I file my taxes last year?"
See [Khoj Chat](/#/chat) for more details
### Find Similar Entries
This feature finds entries similar to the one you are currently on.
1. Move cursor to the org-mode entry, markdown section or text paragraph you want to find similar entries for
2. Hit `C-c s f` (or `M-x khoj RET f`) to find similar entries
### Advanced Usage
- Add [query filters](https://github.com/khoj-ai/khoj/#query-filters) during search to narrow down results further
e.g `What is the meaning of life? -"god" +"none" dt>"last week"`
- Use `C-c C-o 2` to open the current result at cursor in its source org file
- This calls `M-x org-open-at-point` on the current entry and opens the second link in the entry.
- The second link is the entries [org-id](https://orgmode.org/manual/Handling-Links.html#FOOT28), if set, or the heading text.
The first link is the line number of the entry in the source file. This link is less robust to file changes.
- Note: If you have [speed keys](https://orgmode.org/manual/Speed-Keys.html) enabled, `o 2` will also work
### Khoj Menu
![](./assets/khoj_emacs_menu.png)
Hit `C-c s` (or `M-x khoj`) to open the khoj menu above. Then:
- Hit `t` until you preferred content type is selected in the khoj menu
`Content Type` specifies the content to perform `Search`, `Update` or `Find Similar` actions on
- Hit `n` twice and then enter number of results you want to see
`Results Count` is used by the `Search` and `Find Similar` actions
- Hit `-f u` to `force` update the khoj content index
The `Force Update` switch is only used by the `Update` action
## Upgrade
### Upgrade Khoj Backend
```bash
pip install --upgrade khoj-assistant
```
### Upgrade Khoj.el
Use your Emacs package manager to upgrade `khoj.el`
- For `khoj.el` from MELPA
- Method 1
- Run `M-x package-list-packages` to list all packages
- Press `U` on `khoj` to mark it for upgrade
- Press `x` to execute the marked actions
- Method 2
- Run `M-x package-refresh-content`
- Run `M-x package-reinstall khoj`
- For `khoj.el` from Straight
- Run `M-x straight-pull-package khoj`

View File

@@ -1,34 +0,0 @@
## Features
#### [Search](./search.md)
- **Local**: Your personal data stays local. All search and indexing is done on your machine.
- **Incremental**: Incremental search for a fast, search-as-you-type experience
#### [Chat](./chat.md)
- **Faster answers**: Find answers faster, smoother than search. No need to manually scan through your notes to find answers.
- **Iterative discovery**: Iteratively explore and (re-)discover your notes
- **Assisted creativity**: Smoothly weave across answers retrieval and content generation
#### General
- **Natural**: Advanced natural language understanding using Transformer based ML Models
- **Pluggable**: Modular architecture makes it easy to plug in new data sources, frontends and ML models
- **Multiple Sources**: Index your Org-mode and Markdown notes, PDF files, Github repositories, and Photos
- **Multiple Interfaces**: Interact from your [Web Browser](./web.md), [Emacs](./emacs.md) or [Obsidian](./obsidian.md)
### Supported Interfaces
[![Khoj on Emacs](https://img.shields.io/badge/Emacs-%237F5AB6.svg?&style=for-the-badge&logo=gnu-emacs&logoColor=white)](./emacs.md)
<span>&nbsp;</span>
[![Khoj on Obsidian](https://img.shields.io/badge/Obsidian-%23483699.svg?style=for-the-badge&logo=obsidian&logoColor=white)](./obsidian.md)
### Supported Data Sources
- markdown*
- org-mode*
- pdf*
- images*
- [github](./github_integration.md)
- [notion](./notion_integration.md)
\* These data sources are offline only.
If you're using Github or Notion, you can get on a waitlist for [Khoj Cloud](https://khoj.dev).

View File

@@ -1,36 +0,0 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Document</title>
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" />
<meta name="description" content="Description">
<meta name="viewport" content="width=device-width, initial-scale=1.0, minimum-scale=1.0">
<link rel="stylesheet" href="//cdn.jsdelivr.net/npm/docsify/lib/themes/buble.css" />
<link rel="icon" href="./assets/favicon-128x128.ico">
</head>
<body>
<div id="app"></div>
<script>
window.$docsify = {
name: 'Khoj',
repo: 'https://github.com/khoj-ai/khoj',
loadSidebar: true,
themeColor: '#c2a600',
// coverpage: true,
}
</script>
<!-- Docsify v4 -->
<script src="//cdn.jsdelivr.net/npm/docsify@4"></script>
<script src="//cdn.jsdelivr.net/npm/docsify/lib/plugins/search.min.js"></script>
<script src="//cdn.jsdelivr.net/npm/docsify-copy-code/dist/docsify-copy-code.min.js"></script>
<script src="//cdn.jsdelivr.net/npm/prismjs@1/components/prism-bash.min.js"></script>
<script src="//cdn.jsdelivr.net/npm/prismjs@1/components/prism-diff.min.js"></script>
<script defer data-domain="khoj.dev" src="https://plausible.io/js/script.js"></script>
</body>
<style>
video {
max-width: 800px;
}
</style>
</html>

View File

@@ -1,14 +0,0 @@
## 📜 Notion Integration
Khoj now supports search/chat with pages in your Notion workspaces. [Notion](notion.so/) is a platform people use for taking notes, especially for collaboration.
We haven't setup a fancy integration with OAuth yet, so this integration still requires some effort on your end to generate an API key.
1. Go to https://www.notion.so/my-integrations and create a new integration called Khoj to get an API key.
![setup_new_integration](https://github.com/khoj-ai/khoj/assets/65192171/b056e057-d4dc-47dc-aad3-57b59a22c68b)
3. Share all the workspaces that you want to integrate with the Khoj integration you just made in the previous step
![enable_workspace](https://github.com/khoj-ai/khoj/assets/65192171/98290303-b5b8-4cb0-b32c-f68c6923a3d0)
4. In the first step, you generated an API key. Use the newly generated API Key in your Khoj settings, by default at http://localhost:42110/config/content_type/notion. Click `Save`.
5. Click `Configure` in http://localhost:42110/config to index your Notion workspace(s).
That's it! You should be ready to start searching and chatting. Make sure you've configured your OpenAI API Key for chat.

View File

@@ -1,119 +0,0 @@
<h1><img src="./assets/khoj-logo-sideways-500.png" width="200" alt="Khoj Logo">Obsidian</h1>
> An AI personal assistant for your Digital Brain in Obsidian
## Features
- **Search**
- **Natural**: Advanced natural language understanding using Transformer based ML Models
- **Local**: Your personal data stays local. All search and indexing is done on your machine. *Unlike chat which requires access to GPT.*
- **Incremental**: Incremental search for a fast, search-as-you-type experience
- **Chat**
- **Faster answers**: Find answers faster and with less effort than search
- **Iterative discovery**: Iteratively explore and (re-)discover your notes
- **Assisted creativity**: Smoothly weave across answers retrieval and content generation
## Interface
![](./assets/khoj_search_on_obsidian.png ':size=400px')
![](./assets/khoj_chat_on_obsidian.png ':size=400px')
## Setup
- *Make sure [python](https://realpython.com/installing-python/) and [pip](https://pip.pypa.io/en/stable/installation/) are installed on your machine*
- *Ensure you follow the ordering of the setup steps. Install the plugin after starting the khoj backend. This allows the plugin to configure the khoj backend*
### 1. Setup Backend
Open terminal/cmd and run below command to install and start the khoj backend
- On Linux/MacOS
```shell
python -m pip install khoj-assistant && khoj
```
- On Windows
```shell
py -m pip install khoj-assistant && khoj
```
### 2. Setup Plugin
1. Open [Khoj](https://obsidian.md/plugins?id=khoj) from the *Community plugins* tab in Obsidian settings panel
2. Click *Install*, then *Enable* on the Khoj plugin page in Obsidian
3. [Optional] To enable Khoj Chat, set your [OpenAI API key](https://platform.openai.com/account/api-keys) in the Khoj plugin settings
See [official Obsidian plugin docs](https://help.obsidian.md/Extending+Obsidian/Community+plugins) for details
## Use
### Chat
Run *Khoj: Chat* from the [Command Palette](https://help.obsidian.md/Plugins/Command+palette) and ask questions in a natural, conversational style.<br />
E.g "When did I file my taxes last year?"
Notes:
- *Using Khoj Chat will result in query relevant notes being shared with OpenAI for ChatGPT to respond.*
- *To use Khoj Chat, ensure you've set your [OpenAI API key](https://platform.openai.com/account/api-keys) in the Khoj plugin settings.*
See [Khoj Chat](/chat) for more details
### Search
Click the *Khoj search* icon 🔎 on the [Ribbon](https://help.obsidian.md/User+interface/Workspace/Ribbon) or run *Khoj: Search* from the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
*Note: Ensure the khoj server is running in the background before searching. Execute `khoj` in your terminal if it is not already running*
[search_demo](https://user-images.githubusercontent.com/6413477/218801155-cd67e8b4-a770-404a-8179-d6b61caa0f93.mp4 ':include :type=mp4')
#### Query Filters
Use structured query syntax to filter the natural language search results
- **Word Filter**: Get entries that include/exclude a specified term
- Entries that contain term_to_include: `+"term_to_include"`
- Entries that contain term_to_exclude: `-"term_to_exclude"`
- **Date Filter**: Get entries containing dates in YYYY-MM-DD format from specified date (range)
- Entries from April 1st 1984: `dt:"1984-04-01"`
- Entries after March 31st 1984: `dt>="1984-04-01"`
- Entries before April 2nd 1984 : `dt<="1984-04-01"`
- **File Filter**: Get entries from a specified file
- Entries from incoming.org file: `file:"incoming.org"`
- Combined Example
- `what is the meaning of life? file:"1984.org" dt>="1984-01-01" dt<="1985-01-01" -"big" -"brother"`
- Adds all filters to the natural language query. It should return entries
- from the file *1984.org*
- containing dates from the year *1984*
- excluding words *"big"* and *"brother"*
- that best match the natural language query *"what is the meaning of life?"*
### Find Similar Notes
To see other notes similar to the current one, run *Khoj: Find Similar Notes* from the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
## Upgrade
### 1. Upgrade Backend
```shell
pip install --upgrade khoj-assistant
```
### 2. Upgrade Plugin
1. Open *Community plugins* tab in Obsidian settings
2. Click the *Check for updates* button
3. Click the *Update* button next to Khoj, if available
## Demo
### Search Demo
[demo](https://github-production-user-asset-6210df.s3.amazonaws.com/6413477/240061700-3e33d8ea-25bb-46c8-a3bf-c92f78d0f56b.mp4 ':include :type=mp4')
#### Description
1. Install Khoj via `pip` and start Khoj backend
```shell
python -m pip install khoj-assistant && khoj
```
2. Install Khoj plugin via Community Plugins settings pane on Obsidian app
- Check the new Khoj plugin settings
- Wait for Khoj backend to index markdown, PDF files in the current Vault
- Open Khoj plugin on Obsidian via Search button on Left Pane
- Search \"*Announce plugin to folks*\" in the [Obsidian Plugin docs](https://marcus.se.net/obsidian-plugin-docs/)
- Jump to the [search result](https://marcus.se.net/obsidian-plugin-docs/publishing/submit-your-plugin)
## Troubleshooting
- Open the Khoj plugin settings pane, to configure Khoj
- Toggle Enable/Disable Khoj, if setting changes have not applied
- Click *Update* button to force index to refresh, if results are failing or stale
## Current Limitations
- The plugin loads the index of only one vault at a time.<br/>
So notes across multiple vaults **cannot** be searched at the same time

View File

@@ -1,129 +0,0 @@
## Setup
These are the general setup instructions for Khoj.
- Make sure [python](https://realpython.com/installing-python/) and [pip](https://pip.pypa.io/en/stable/installation/) are installed on your machine
- Check the [Khoj Emacs docs](/emacs?id=setup) to setup Khoj with Emacs<br />
Its simpler as it can skip the server *install*, *run* and *configure* step below.
- Check the [Khoj Obsidian docs](/obsidian?id=_2-setup-plugin) to setup Khoj with Obsidian<br />
Its simpler as it can skip the *configure* step below.
### 1. Install
#### 1.1 Local Server Setup
Run the following command in your terminal to install the Khoj backend.
- On Linux/MacOS
```shell
python -m pip install khoj-assistant
```
- On Windows
```shell
py -m pip install khoj-assistant
```
For more detailed Windows installation and troubleshooting, see [Windows Install](./windows_install.md).
##### 1.1.1 Local Server Start
Run the following command from your terminal to start the Khoj backend and open Khoj in your browser.
```shell
khoj
```
Khoj should now be running at http://localhost:42110. You can see the web UI in your browser.
Note: To start Khoj automatically in the background use [Task scheduler](https://www.windowscentral.com/how-create-automated-task-using-task-scheduler-windows-10) on Windows or [Cron](https://en.wikipedia.org/wiki/Cron) on Mac, Linux (e.g with `@reboot khoj`)
#### 1.2 Local Docker Setup
Use the sample docker-compose [in Github](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml) to run Khoj in Docker. To start the container, run the following command in the same directory as the docker-compose.yml file. You'll have to configure the mounted directories to match your local knowledge base.
```shell
docker-compose up
```
Khoj should now be running at http://localhost:42110. You can see the web UI in your browser.
### 1.2 Select files for indexing using the desktop client [Optional]
You can use our desktop executables to select file paths and folders to index. You can simply select the folders or files, and they'll be automatically uploaded to the server. Once you specify a file or file path, you don't need to update the configuration again; it will grab any data diffs dynamically over time. This part is currently optional, but may make setup and configuration slightly easier. It removes the need for setting up custom file paths for your Khoj data configurations.
**To download the desktop client, go to https://download.khoj.dev** and the correct executable for your OS will automatically start downloading. Once downloaded, you can configure your folders for indexing using the settings tab. To set your chat configuration, you'll have to use the web interface for the Khoj server you setup in the previous step.
### 1.3 Use (deprecated) desktop builds
Before v0.11.4, we had self-contained desktop builds that included both the server and the client. These were difficult to maintain, but are still available as part of earlier releases. To find setup instructions, see here:
- [Desktop Installation](desktop_installation.md)
- [Windows Installation](windows_install.md)
### 2. Configure
1. Set `File`, `Folder` and hit `Save` in each Plugins you want to enable for Search on the Khoj config page
2. Add your OpenAI API key to Chat Feature settings if you want to use Chat
3. Click `Configure` and wait. The app will download ML models and index the content for search and (optionally) chat
![configure demo](https://user-images.githubusercontent.com/6413477/255307879-61247d3f-c69a-46ef-b058-9bc533cb5c72.mp4 ':include :type=mp4')
### 3. Install Interface Plugins (Optional)
Khoj exposes a web interface to search, chat and configure by default.<br />
The optional steps below allow using Khoj from within an existing application like Obsidian or Emacs.
- **Khoj Obsidian**:<br />
[Install](/obsidian?id=_2-setup-plugin) the Khoj Obsidian plugin
- **Khoj Emacs**:<br />
[Install](/emacs?id=setup) khoj.el
## Upgrade
### Upgrade Khoj Server
```shell
pip install --upgrade khoj-assistant
```
*Note: To upgrade to the latest pre-release version of the khoj server run below command*
```shell
# Maps to the latest commit on the master branch
pip install --upgrade --pre khoj-assistant
```
### Upgrade Khoj on Emacs
- Use your Emacs Package Manager to Upgrade
- See [khoj.el package setup](/emacs?id=setup) for details
### Upgrade Khoj on Obsidian
- Upgrade via the Community plugins tab on the settings pane in the Obsidian app
- See the [khoj plugin setup](/obsidian.md?id=_2-setup-plugin) for details
## Uninstall
1. (Optional) Hit `Ctrl-C` in the terminal running the khoj server to stop it
2. Delete the khoj directory in your home folder (i.e `~/.khoj` on Linux, Mac or `C:\Users\<your-username>\.khoj` on Windows)
5. You might want to `rm -rf` the following directories:
- `~/.khoj`
- `~/.cache/gpt4all`
3. Uninstall the khoj server with `pip uninstall khoj-assistant`
4. (Optional) Uninstall khoj.el or the khoj obsidian plugin in the standard way on Emacs, Obsidian
## Troubleshoot
#### Install fails while building Tokenizer dependency
- **Details**: `pip install khoj-assistant` fails while building the `tokenizers` dependency. Complains about Rust.
- **Fix**: Install Rust to build the tokenizers package. For example on Mac run:
```shell
brew install rustup
rustup-init
source ~/.cargo/env
```
- **Refer**: [Issue with Fix](https://github.com/khoj-ai/khoj/issues/82#issuecomment-1241890946) for more details
#### Search starts giving wonky results
- **Fix**: Open [/api/update?force=true](http://localhost:42110/api/update?force=true) in browser to regenerate index from scratch
- **Note**: *This is a fix for when you perceive the search results have degraded. Not if you think they've always given wonky results*
#### Khoj in Docker errors out with \"Killed\" in error message
- **Fix**: Increase RAM available to Docker Containers in Docker Settings
- **Refer**: [StackOverflow Solution](https://stackoverflow.com/a/50770267), [Configure Resources on Docker for Mac](https://docs.docker.com/desktop/mac/#resources)
#### Khoj errors out complaining about Tensors mismatch or null
- **Mitigation**: Disable `image` search using the desktop GUI

View File

@@ -1,20 +0,0 @@
<h1><img src="./assets/khoj-logo-sideways-500.png" width="200" alt="Khoj Logo">Web</h1>
> An AI personal assistant for your Digital Brain
## Features
- **Search**
- **Natural**: Advanced natural language understanding using Transformer based ML Models
- **Local**: Your personal data stays local. All search and indexing is done on your machine. *Unlike chat which requires access to GPT.*
- **Incremental**: Incremental search for a fast, search-as-you-type experience
- **Chat**
- **Faster answers**: Find answers faster and with less effort than search
- **Iterative discovery**: Iteratively explore and (re-)discover your notes
- **Assisted creativity**: Smoothly weave across answers retrieval and content generation
## Setup
The Khoj web interface is the default interface. It comes packaged with the khoj server.
## Interface
![](./assets/khoj_search_on_web.png ':size=400px')
![](./assets/khoj_chat_on_web.png ':size=400px')

View File

@@ -1,23 +0,0 @@
# Windows Installation
These steps can be used to setup Khoj on a clean, new Windows 11 machine. It has been tested on a Windows VM
## Prerequisites
1. Ensure you have Visual Studio C++ Build tools installed. You can download it [from Microsoft here](https://visualstudio.microsoft.com/visual-cpp-build-tools/). At the minimum, you should have the following configuration:
<img width="1152" alt="Screenshot 2023-07-12 at 3 56 25 PM" src="https://github.com/khoj-ai/khoj/assets/65192171/b506a858-2f5e-4c85-946b-5422d83f112a">
2. Ensure you have Python installed. You can check by running `python --version`. If you don't, install the latest version [from here](https://www.python.org/downloads/).
- Ensure you have pip installed: `py -m ensurepip --upgrade`.
## Quick start
1. Open a PowerShell terminal.
2. Run `pip install khoj-assistant`
3. Start Khoj with `khoj`
## Installation in a Virtual Environment
Use this if you want to install with a virtual environment. This will make it much easier to manage your dependencies. You can read more about [virtual environments](https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/) here.
1. Open a PowerShell terminal with the `Run as Administrator` privileges.
2. Create a virtual environment: `mkdir khoj && cd khoj && py -m venv .venv`
3. Activate the virtual environment: `.\.venv\Scripts\activate`. If you get a permissions error, then run `Set-ExecutionPolicy -ExecutionPolicy RemoteSigned`.
4. Run `pip install khoj-assistant`
5. Start Khoj with `khoj`

20
documentation/.gitignore vendored Normal file
View File

@@ -0,0 +1,20 @@
# Dependencies
/node_modules
# Production
/build
# Generated files
.docusaurus
.cache-loader
# Misc
.DS_Store
.env.local
.env.development.local
.env.test.local
.env.production.local
npm-debug.log*
yarn-debug.log*
yarn-error.log*

41
documentation/README.md Normal file
View File

@@ -0,0 +1,41 @@
# Website
This website is built using [Docusaurus](https://docusaurus.io/), a modern static website generator.
### Installation
```
$ yarn
```
### Local Development
```
$ yarn start
```
This command starts a local development server and opens up a browser window. Most changes are reflected live without having to restart the server.
### Build
```
$ yarn build
```
This command generates static content into the `build` directory and can be served using any static contents hosting service.
### Deployment
Using SSH:
```
$ USE_SSH=true yarn deploy
```
Not using SSH:
```
$ GIT_USER=<Your GitHub username> yarn deploy
```
If you are using GitHub pages for hosting, this command is a convenient way to build the website and push to the `gh-pages` branch.

Binary file not shown.

After

Width:  |  Height:  |  Size: 89 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 66 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 3.0 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 170 KiB

View File

Before

Width:  |  Height:  |  Size: 13 KiB

After

Width:  |  Height:  |  Size: 13 KiB

View File

Before

Width:  |  Height:  |  Size: 36 KiB

After

Width:  |  Height:  |  Size: 36 KiB

View File

Before

Width:  |  Height:  |  Size: 1.2 MiB

After

Width:  |  Height:  |  Size: 1.2 MiB

View File

Before

Width:  |  Height:  |  Size: 350 KiB

After

Width:  |  Height:  |  Size: 350 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 298 KiB

View File

Before

Width:  |  Height:  |  Size: 302 KiB

After

Width:  |  Height:  |  Size: 302 KiB

View File

Before

Width:  |  Height:  |  Size: 394 KiB

After

Width:  |  Height:  |  Size: 394 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 187 KiB

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 27 KiB

View File

Before

Width:  |  Height:  |  Size: 544 KiB

After

Width:  |  Height:  |  Size: 544 KiB

File diff suppressed because one or more lines are too long

After

Width:  |  Height:  |  Size: 43 KiB

View File

Before

Width:  |  Height:  |  Size: 49 KiB

After

Width:  |  Height:  |  Size: 49 KiB

View File

Before

Width:  |  Height:  |  Size: 333 KiB

After

Width:  |  Height:  |  Size: 333 KiB

View File

Before

Width:  |  Height:  |  Size: 445 KiB

After

Width:  |  Height:  |  Size: 445 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 333 KiB

View File

Before

Width:  |  Height:  |  Size: 420 KiB

After

Width:  |  Height:  |  Size: 420 KiB

View File

Before

Width:  |  Height:  |  Size: 478 KiB

After

Width:  |  Height:  |  Size: 478 KiB

View File

Before

Width:  |  Height:  |  Size: 268 KiB

After

Width:  |  Height:  |  Size: 268 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 71 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 8.4 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 3.0 MiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 103 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 119 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 265 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 31 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 9.4 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 94 KiB

Binary file not shown.

After

Width:  |  Height:  |  Size: 32 MiB

View File

@@ -0,0 +1,3 @@
module.exports = {
presets: [require.resolve('@docusaurus/core/lib/babel/preset')],
};

View File

@@ -0,0 +1,8 @@
{
"label": "Advanced Self Hosting",
"position": 6,
"link": {
"type": "generated-index",
"description": "Advanced setup for Self Hosting Khoj server"
}
}

View File

@@ -0,0 +1,52 @@
# Authenticate
:::info
This is only helpful for self-hosted users or teams. If you're using [Khoj Cloud](https://app.khoj.dev), both Magic Links and Google OAuth work.
:::
By default, most of the instructions for self-hosting Khoj assume a single user, and so the default configuration is to run in anonymous mode. However, if you want to enable authentication, you can do so either with with [Magic Links](#using-magic-links) or [Google OAuth](#using-google-oauth) as shown below. This can be helpful to make Khoj securely accessible to you and your team.
:::tip[Note]
Remove the `--anonymous-mode` flag in your start up command to enable authentication.
:::
## Using Magic Links
The most secure way to do this is to integrate with [Resend](https://resend.com) by setting up an account and adding an environment variable for `RESEND_API_KEY`. You can get your API key [here](https://resend.com/api-keys). This will allow you to automatically send sign-in links to users who want to log in.
It's still possible to use the magic links feature without Resend, but you'll need to manually send the magic links to users who want to log in.
## Manually sending magic links
1. The user will have to enter their email address in the login form.
They'll click `Send Magic Link`. Without the Resend API key, this will just create an unverified account for them in the backend
<img src="/img/magic_link.png" alt="Magic link login form" width="400"/>
2. You can get their magic link using the admin panel
Go to the [admin panel](http://localhost:42110/server/admin/database/khojuser/). You'll see a list of users. Search for the user you want to send a magic link to. Tick the checkbox next to their row, and use the action drop down at the top to 'Get email login URL'. This will generate a magic link that you can send to the user, which will appear at the top of the admin interface.
| Get email login URL | Retrieved login URL |
|---------------------|---------------------|
| <img src="/img/admin_get_emali_login.png" alt="Get user magic sign in link" width="400" />| <img src="/img/admin_successful_login_url.png" alt="Successfully retrieved a login URL" width="400" />|
3. Send the magic link to the user. They can click on it to log in.
Once they click on the link, they'll automatically be logged in. They'll have to repeat this process for every new device they want to log in from, but they shouldn't have to repeat it on the same device.
A given magic link can only be used once. If the user tries to use it again, they'll be redirected to the login page to get a new magic link.
## Using Google OAuth
To set up your self-hosted Khoj with Google Auth, you need to create a project in the Google Cloud Console and enable the Google Auth API.
To implement this, you'll need to:
1. You must use the `python` package or build from source, because you'll need to install additional packages for the google auth libraries (`prod`). The syntax to install the right packages is
```
pip install khoj[prod]
```
2. [Create authorization credentials](https://developers.google.com/identity/sign-in/web/sign-in) for your application.
3. Open your [Google cloud console](https://console.developers.google.com/apis/credentials) and create a configuration like below for the relevant `OAuth 2.0 Client IDs` project:
![Google auth login project settings](https://github.com/khoj-ai/khoj/assets/65192171/9bcbf6f4-197d-4d0c-973a-c10b1331c892)
4. Configure these environment variables: `GOOGLE_CLIENT_SECRET`, and `GOOGLE_CLIENT_ID`. You can find these values in the Google cloud console, in the same place where you configured the authorized origins and redirect URIs.
That's it! That should be all you have to do. Now, when you reload Khoj without `--anonymous-mode`, you should be able to use your Google account to sign in.

View File

@@ -0,0 +1,37 @@
# LiteLLM
:::info
This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you're limited to our first-party models.
:::
:::info
Khoj natively supports local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). Using an OpenAI API proxy with Khoj maybe useful for ease of setup, trying new models or using commercial LLMs via API.
:::
[LiteLLM](https://docs.litellm.ai/docs/proxy/quick_start) exposes an OpenAI compatible API that proxies requests to other LLM API services. This provides a standardized API to interact with both open-source and commercial LLMs.
Using LiteLLM with Khoj makes it possible to turn any LLM behind an API into your personal AI agent.
## Setup
1. Install LiteLLM
```bash
pip install litellm[proxy]
```
2. Start LiteLLM and use Mistral tiny via Mistral API
```
export MISTRAL_API_KEY=<MISTRAL_API_KEY>
litellm --model mistral/mistral-tiny --drop_params
```
3. Create a new [OpenAI Processor Conversation Config](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/add) on your Khoj admin panel
- Name: `proxy-name`
- Api Key: `any string`
- Api Base Url: **URL of your Openai Proxy API**
4. Create a new [Chat Model Option](http://localhost:42110/server/admin/database/chatmodeloptions/add) on your Khoj admin panel.
- Name: `llama3.1` (replace with the name of your local model)
- Model Type: `Openai`
- Openai Config: `<the proxy config you created in step 3>`
- Max prompt size: `20000` (replace with the max prompt size of your model)
- Tokenizer: *Do not set for OpenAI, Mistral, Llama3 based models*
5. Create a new [Server Chat Setting](http://localhost:42110/server/admin/database/serverchatsettings/add/) on your Khoj admin panel
- Default model: `<name of chat model option you created in step 4>`
- Summarizer model: `<name of chat model option you created in step 4>`
6. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.

View File

@@ -0,0 +1,30 @@
# LM Studio
:::info
This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you're limited to our first-party models.
:::
:::info
Khoj natively supports local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). Using an OpenAI API proxy with Khoj maybe useful for ease of setup, trying new models or using commercial LLMs via API.
:::
[LM Studio](https://lmstudio.ai/) is a desktop app to chat with open-source LLMs on your local machine. LM Studio provides a neat interface for folks comfortable with a GUI.
LM Studio can expose an [OpenAI API compatible server](https://lmstudio.ai/docs/local-server). This makes it possible to turn chat models from LM Studio into your personal AI agents with Khoj.
## Setup
1. Install [LM Studio](https://lmstudio.ai/) and download your preferred Chat Model
2. Go to the Server Tab on LM Studio, Select your preferred Chat Model and Click the green Start Server button
3. Create a new [OpenAI Processor Conversation Config](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/add) on your Khoj admin panel
- Name: `proxy-name`
- Api Key: `any string`
- Api Base Url: `http://localhost:1234/v1/` (default for LMStudio)
4. Create a new [Chat Model Option](http://localhost:42110/server/admin/database/chatmodeloptions/add) on your Khoj admin panel.
- Name: `llama3.1` (replace with the name of your local model)
- Model Type: `Openai`
- Openai Config: `<the proxy config you created in step 3>`
- Max prompt size: `20000` (replace with the max prompt size of your model)
- Tokenizer: *Do not set for OpenAI, mistral, llama3 based models*
5. Create a new [Server Chat Setting](http://localhost:42110/server/admin/database/serverchatsettings/add/) on your Khoj admin panel
- Default model: `<name of chat model option you created in step 4>`
- Summarizer model: `<name of chat model option you created in step 4>`
6. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.

View File

@@ -0,0 +1,36 @@
# Ollama
:::info
This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you're limited to our first-party models.
:::
:::info
Khoj natively supports local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). Using an OpenAI API proxy with Khoj maybe useful for ease of setup, trying new models or using commercial LLMs via API.
:::
Ollama allows you to run [many popular open-source LLMs](https://ollama.com/library) locally from your terminal.
For folks comfortable with the terminal, Ollama's terminal based flows can ease setup and management of chat models.
Ollama exposes a local [OpenAI API compatible server](https://github.com/ollama/ollama/blob/main/docs/openai.md#models). This makes it possible to use chat models from Ollama to create your personal AI agents with Khoj.
## Setup
1. Setup Ollama: https://ollama.com/
2. Start your preferred model with Ollama. For example,
```bash
ollama run llama3.1
```
3. Create a new [OpenAI Processor Conversation Config](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/add) on your Khoj admin panel
- Name: `ollama`
- Api Key: `any string`
- Api Base Url: `http://localhost:11434/v1/` (default for Ollama)
4. Create a new [Chat Model Option](http://localhost:42110/server/admin/database/chatmodeloptions/add) on your Khoj admin panel.
- Name: `llama3.1` (replace with the name of your local model)
- Model Type: `Openai`
- Openai Config: `<the ollama config you created in step 3>`
- Max prompt size: `20000` (replace with the max prompt size of your model)
5. Create a new [Server Chat Setting](http://localhost:42110/server/admin/database/serverchatsettings/add/) on your Khoj admin panel
- Default model: `<name of chat model option you created in step 4>`
- Summarizer model: `<name of chat model option you created in step 4>`
6. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.
That's it! You should now be able to chat with your Ollama model from Khoj. If you want to add additional models running on Ollama, repeat step 6 for each model.

View File

@@ -0,0 +1,17 @@
# Support Multilingual Docs
Khoj uses an embedding model to understand documents. Multilingual embedding models improve the search quality for documents not in English. This affects both search and chat with docs experiences across Khoj.
To improve search and chat quality for non-english documents you can use a [multilingual model](https://www.sbert.net/docs/pretrained_models.html#multi-lingual-models).<br />
For example, the [paraphrase-multilingual-MiniLM-L12-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2) supports [50+ languages](https://www.sbert.net/docs/pretrained_models.html#:~:text=we%20used%20the%20following%2050%2B%20languages), has decent search quality and speed for a consumer machine.
To use it:
1. Open [the search config](http://localhost:42110/server/admin/database/searchmodelconfig/) on your server's admin settings page. Either create a new search model, if none exists, or update the existing one. For example,
- Set the `bi_encoder` field to `sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2`
- Set the `cross_encoder` field to `mixedbread-ai/mxbai-rerank-xsmall-v1`
2. Regenerate your content index from all the relevant clients. This step is very important, as you'll need to re-encode all your content with the new model.
:::info[Note]
Modern search/embedding model like [mixedbread-ai/mxbai-embed-large-v1](https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1) expect a prefix to the query (or docs) string to improve encoding. Update the `bi_encoder_query_encode_config` field of your [embedding model](http://localhost:42110/server/admin/database/searchmodelconfig/) with `{prompt: <prefix-prompt>}` to improve the search quality of these models.
E.g. `{prompt: "Represent this query for searching documents"}`. You can pass any valid JSON object that the SentenceTransformer `encode` function accepts
:::

View File

@@ -0,0 +1,37 @@
---
sidebar_position: 1
---
# Use OpenAI Proxy
:::info
This is only helpful for self-hosted users. If you're using [Khoj Cloud](https://app.khoj.dev), you're limited to our first-party models.
:::
:::info
Khoj natively supports local LLMs [available on HuggingFace in GGUF format](https://huggingface.co/models?library=gguf). Using an OpenAI API proxy with Khoj maybe useful for ease of setup, trying new models or using commercial LLMs via API.
:::
Khoj can use any OpenAI API compatible server including [Ollama](/advanced/ollama), [LMStudio](/advanced/lmstudio) and [LiteLLM](/advanced/litellm).
Configuring this allows you to use non-standard, open or commercial, local or hosted LLM models for Khoj
Combine them with Khoj can turn your favorite LLM into an AI agent. Allowing you to chat with your docs, find answers from the internet, build custom agents and run automations.
For specific integrations, see our [Ollama](/advanced/ollama), [LMStudio](/advanced/lmstudio) and [LiteLLM](/advanced/litellm) setup docs. For general instructions to setup Khoj with an OpenAI API proxy see below.
## General Setup
1. Start your preferred OpenAI API compatible app
3. Create a new [OpenAI Processor Conversation Config](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/add) on your Khoj admin panel
- Name: `proxy-name`
- Api Key: `any string`
- Api Base Url: **URL of your Openai Proxy API**
4. Create a new [Chat Model Option](http://localhost:42110/server/admin/database/chatmodeloptions/add) on your Khoj admin panel.
- Name: `llama3` (replace with the name of your local model)
- Model Type: `Openai`
- Openai Config: `<the proxy config you created in step 3>`
- Max prompt size: `2000` (replace with the max prompt size of your model)
- Tokenizer: *Do not set for OpenAI, mistral, llama3 based models*
5. Create a new [Server Chat Setting](http://localhost:42110/server/admin/database/serverchatsettings/add/) on your Khoj admin panel
- Default model: `<name of chat model option you created in step 4>`
- Summarizer model: `<name of chat model option you created in step 4>`
6. Go to [your config](http://localhost:42110/settings) and select the model you just created in the chat model dropdown.

View File

@@ -0,0 +1,8 @@
{
"label": "Clients",
"position": 4,
"link": {
"type": "generated-index",
"description": "Different ways for indexing data with the Khoj backend. To see online data sources, go to https://docs.khoj.dev/category/data-sources"
}
}

View File

@@ -0,0 +1,34 @@
---
sidebar_position: 1
---
# Desktop
> Query your Second Brain from your machine
Use the Desktop app to chat and search with Khoj.
You can also share your files, folders with Khoj using the app.
Khoj will keep these files in sync to provide contextual responses when you search or chat.
## Features
- **Chat**
- **Faster answers**: Find answers quickly, from your private notes or the public internet
- **Assisted creativity**: Smoothly weave across retrieving answers and generating content
- **Iterative discovery**: Iteratively explore and re-discover your notes
- **Quick access**: Use [Khoj Mini](/features/khoj_mini) on the desktop to quickly pull up a mini chat module for quicker answers
- **Search**
- **Natural**: Advanced natural language understanding using Transformer based ML Models
- **Incremental**: Incremental search for a fast, search-as-you-type experience
## Setup
1. Install the [Khoj Desktop app](https://khoj.dev/downloads) for your OS
2. Generate an API key on the [Khoj Web App](https://app.khoj.dev/settings#clients)
3. Set your Khoj API Key on the *Settings* page of the Khoj Desktop app
4. [Optional] Add any files, folders you'd like Khoj to be aware of on the *Settings* page and Click *Save*
These files and folders will be automatically kept in sync for you
## Interface
| Chat | Search |
|:----:|:------:|
| ![](/img/khoj_chat_on_desktop.png) | ![](/img/khoj_search_on_desktop.png) |

View File

@@ -0,0 +1,137 @@
---
sidebar_position: 2
---
# Emacs
<img src="https://stable.melpa.org/packages/khoj-badge.svg" width="130" alt="Melpa Stable Badge" />
<img src="https://melpa.org/packages/khoj-badge.svg" width="150" alt="Melpa Badge" />
<img src="https://github.com/khoj-ai/khoj/actions/workflows/build_khoj_el.yml/badge.svg" width="150" alt="Build Badge" />
<img src="https://github.com/khoj-ai/khoj/actions/workflows/test_khoj_el.yml/badge.svg" width="150" alt="Test Badge" />
<br />
<br />
> Query your Second Brain from Emacs
## Features
- **Chat**
- **Faster answers**: Find answers quickly, from your private notes or the public internet
- **Assisted creativity**: Smoothly weave across retrieving answers and generating content
- **Iterative discovery**: Iteratively explore and re-discover your notes
- **Search**
- **Natural**: Advanced natural language understanding using Transformer based ML Models
- **Incremental**: Incremental search for a fast, search-as-you-type experience
## Interface
| Search | Chat |
|:------:|:----:|
| ![khoj search on emacs](/img/khoj_search_on_emacs.png) | ![khoj chat on emacs](/img/khoj_chat_on_emacs.png) |
## Setup
1. Generate an API key on the [Khoj Web App](https://app.khoj.dev/settings#clients)
2. Add below snippet to your Emacs config file, usually at `~/.emacs.d/init.el`
#### **Direct Install**
*Khoj will index your org-agenda files, by default*
```elisp
;; Install Khoj.el
M-x package-install khoj
; Set your Khoj API key
(setq khoj-api-key "YOUR_KHOJ_CLOUD_API_KEY")
```
#### **Minimal Install**
*Khoj will index your org-agenda files, by default*
```elisp
;; Install Khoj client from MELPA Stable
(use-package khoj
:ensure t
:pin melpa-stable
:bind ("C-c s" . 'khoj)
:config (setq khoj-api-key "YOUR_KHOJ_CLOUD_API_KEY"))
```
#### **Standard Install**
*Configures the specified org files, directories to be indexed by Khoj*
```elisp
;; Install Khoj client from MELPA Stable
(use-package khoj
:ensure t
:pin melpa-stable
:bind ("C-c s" . 'khoj)
:config (setq khoj-api-key "YOUR_KHOJ_CLOUD_API_KEY"
khoj-org-directories '("~/docs/org-roam" "~/docs/notes")
khoj-org-files '("~/docs/todo.org" "~/docs/work.org")))
```
#### **Straight.el**
*Configures the specified org files, directories to be indexed by Khoj*
```elisp
;; Install Khoj client using Straight.el
(use-package khoj
:after org
:straight (khoj :type git :host github :repo "khoj-ai/khoj" :files (:defaults "src/interface/emacs/khoj.el"))
:bind ("C-c s" . 'khoj)
:config (setq khoj-api-key "YOUR_KHOJ_CLOUD_API_KEY"
khoj-org-directories '("~/docs/org-roam" "~/docs/notes")
khoj-org-files '("~/docs/todo.org" "~/docs/work.org")))
```
## Use
### Search
See [Khoj Search](/features/search) for details
1. Hit `C-c s s` (or `M-x khoj RET s`) to open khoj search
2. Enter your query in natural language<br/>
E.g. *"What is the meaning of life?"*, *"My life goals for 2023"*
### Chat
See [Khoj Chat](/features/chat) for details
1. Hit `C-c s c` (or `M-x khoj RET c`) to open khoj chat
2. Ask questions in a natural, conversational style<br/>
E.g. *"When did I file my taxes last year?"*
### Find Similar Entries
This feature finds entries similar to the one you are currently on.
1. Move cursor to the org-mode entry, markdown section or text paragraph you want to find similar entries for
2. Hit `C-c s f` (or `M-x khoj RET f`) to find similar entries
### Advanced Usage
- Add [query filters](https://github.com/khoj-ai/khoj/#query-filters) during search to narrow down results further
e.g. `What is the meaning of life? -"god" +"none" dt>"last week"`
- Use `C-c C-o 2` to open the current result at cursor in its source org file
- This calls `M-x org-open-at-point` on the current entry and opens the second link in the entry.
- The second link is the entries [org-id](https://orgmode.org/manual/Handling-Links.html#FOOT28), if set, or the heading text.
The first link is the line number of the entry in the source file. This link is less robust to file changes.
- Note: If you have [speed keys](https://orgmode.org/manual/Speed-Keys.html) enabled, `o 2` will also work
### Khoj Menu
![](/img/khoj_emacs_menu.png)
Hit `C-c s` (or `M-x khoj`) to open the khoj menu above. Then:
- Hit `t` until you preferred content type is selected in the khoj menu
`Content Type` specifies the content to perform `Search`, `Update` or `Find Similar` actions on
- Hit `n` twice and then enter number of results you want to see
`Results Count` is used by the `Search` and `Find Similar` actions
- Hit `-f u` to `force` update the khoj content index
The `Force Update` switch is only used by the `Update` action
## Upgrade
Use your Emacs package manager to upgrade `khoj.el`
<!-- tabs:start -->
#### **With MELPA**
1. Run `M-x package-refresh-content`
2. Run `M-x package-reinstall khoj`
#### **With Straight.el**
- Run `M-x straight-pull-package khoj`
<!-- tabs:end -->

View File

@@ -0,0 +1,56 @@
---
sidebar_position: 3
---
# Obsidian
> Query your Second Brain from Obsidian
![demo](https://assets.khoj.dev/obsidian_khoj_side_panel_pak_telemedicine.gif)
## Features
- **Chat**
- **Faster answers**: Find answers quickly, from your private notes or the public internet
- **Assisted creativity**: Smoothly weave across retrieving answers and generating content
- **Iterative discovery**: Iteratively explore and re-discover your notes
- **Search**
- **Natural**: Advanced natural language understanding using Transformer based ML Models
- **Incremental**: Incremental search for a fast, search-as-you-type experience
- **Similar**
- **Discover**: Find similar notes to the current one
## Setup
1. Open [Khoj](https://obsidian.md/plugins?id=khoj) from the *Community plugins* tab in Obsidian settings panel
2. Click *Install*, then *Enable* on the Khoj plugin page in Obsidian
3. Generate an API key on the [Khoj Web App](https://app.khoj.dev/settings#clients)
4. Set your Khoj API Key in the Khoj plugin settings in Obsidian
See the official [Obsidian Plugin Docs](https://help.obsidian.md/Extending+Obsidian/Community+plugins) for more details on installing Obsidian plugins.
## Use
### Chat
Click the *Khoj chat* icon 💬 on the [Ribbon](https://help.obsidian.md/User+interface/Workspace/Ribbon) or run *Khoj: Chat* from the [Command Palette](https://help.obsidian.md/Plugins/Command+palette) and ask questions in a natural, conversational style.<br />
E.g. *"When did I file my taxes last year?"*
See [Khoj Chat](/features/chat) for more details
### Find Similar Notes
To see other notes similar to the current one, run *Khoj: Find Similar Notes* from the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
### Search
Run *Khoj: Search* from the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
See [Khoj Search](/features/search) for more details. Use [query filters](/miscellaneous/advanced#query-filters) to limit entries to search
[search_demo](https://user-images.githubusercontent.com/6413477/218801155-cd67e8b4-a770-404a-8179-d6b61caa0f93.mp4 ':include :type=mp4')
## Upgrade
1. Open *Community plugins* tab in Obsidian settings
2. Click the *Check for updates* button
3. Click the *Update* button next to Khoj, if available
## Troubleshooting
- Open the Khoj plugin settings pane, to configure Khoj
- Toggle Enable/Disable Khoj, if setting changes have not applied
- Click *Update* button to force index to refresh, if results are failing or stale

View File

@@ -0,0 +1,45 @@
---
sidebar_position: 4
---
# Web
> Query your Second Brain from your Web Browser
Without any desktop clients, you can start chatting with Khoj on the web. Bear in mind you do need one of the desktop clients in order to share and sync your data with Khoj.
## Features
- **Chat**
- **Faster answers**: Find answers quickly, from your private notes or the public internet
- **Assisted creativity**: Smoothly weave across retrieving answers and generating content
- **Iterative discovery**: Iteratively explore and re-discover your notes
- **Search**
- **Natural**: Advanced natural language understanding using Transformer based ML Models
- **Incremental**: Incremental search for a fast, search-as-you-type experience
## Setup
No setup required. The Khoj web app is the default Khoj client. You can access it from any web browser. Try it on [Khoj Cloud](https://app.khoj.dev)
## Upload Documents
You can upload documents to Khoj from the web interface, one at a time. This is useful for uploading documents from your phone or tablet. To upload a document:
1. You can drag and drop the document into the chat window.
2. Or click the paperclip icon in the chat window and select the document from your file system.
![demo of dragging and dropping a file](https://assets.khoj.dev/drag_drop_file.gif)
### Install on Phone
You can optionally install Khoj as a [Progressive Web App (PWA)](https://web.dev/learn/pwa/installation). This makes it quick and easy to access Khoj on your phone.
1. Login to [Khoj Cloud](https://app.khoj.dev) or your self-hosted Khoj server from the web browser (prefer Chrome/Edge) on your phone
2. Open the 3 dot menu on the browser and click the "Add to Home screen" option
3. Click "Install" on the next screen to add the Khoj icon to your phone Home screen
**Process via Screenshots**
| Step 1 | Step 2 | Step 3|
|:---:|:---:|:---:|
| ![](/img/pwa_install_1.png) | ![](/img/pwa_install_2.png) | ![](/img/pwa_install_3.png) |
## Interface
| Search | Chat |
|:------:|:----:|
| ![](/img/khoj_search_on_web.png) | ![](/img/khoj_chat_on_web.png) |

View File

@@ -0,0 +1,30 @@
---
sidebar_position: 5
---
# WhatsApp
> Query your Second Brain from WhatsApp
Text [+1 (848) 800 4242](https://wa.me/18488004242) or scan the QQ code below on your phone to chat with Khoj on WhatsApp.
Without any desktop clients, you can start chatting with Khoj on WhatsApp. Bear in mind you do need one of the desktop clients in order to share and sync your data with Khoj. The WhatsApp AI bot will work right away for answering generic queries and using Khoj in default mode.
In order to use Khoj on WhatsApp with your own data, you need to setup a Khoj Cloud account and connect your WhatsApp account to it. This is a one time setup and you can do it from the [Khoj Cloud config page](https://app.khoj.dev/settings).
If you hit usage limits for the WhatsApp bot, upgrade to [a paid plan](https://khoj.dev/pricing) on Khoj Cloud.
<img src="https://khoj-web-bucket.s3.amazonaws.com/khojwhatsapp.png" alt="WhatsApp QR Code" width="300" height="300" />
## Features
- **Slash Commands**: Use slash commands to quickly access Khoj features
- `/online`: Get responses from Khoj powered by online search.
- `/dream`: Generate an image in response to your prompt.
- `/notes`: Explicitly force Khoj to retrieve context from your notes. Note: You'll need to connect your WhatsApp account to a Khoj Cloud account for this to work.
We have more commands under development, including `/share` to uploading documents directly to your Khoj account from WhatsApp, and `/speak` in order to get a speech response from Khoj. Feel free to [raise an issue](https://github.com/khoj-ai/flint/issues) if you have any suggestions for new commands.
## Source Code
You can find all of the code for the WhatsApp bot in the the [flint repository](https://github.com/khoj-ai/flint). As all of our code, it is open source and you can contribute to it.

View File

@@ -0,0 +1,8 @@
{
"label": "Contributing",
"position": 2,
"link": {
"type": "generated-index",
"description": "Development Setup"
}
}

View File

@@ -0,0 +1,257 @@
---
sidebar_position: 0
---
# Development
Welcome to the development docs of Khoj! Thanks for you interesting in being a contributor ❤️. Open source contributors are a corner-store of the Khoj community. We welcome all contributions, big or small.
To get started with contributing, check out the official GitHub docs on [contributing to an open-source project](https://docs.github.com/en/get-started/exploring-projects-on-github/contributing-to-a-project).
Join the [Discord](https://discord.gg/WaxF3SkFPU) server and click the ✅ for the question "Are you interested in becoming a contributor?" in the `#welcome-and-rules` channel. This will give you access to the `#contributors` channel where you can ask questions and get help from other contributors.
If you're looking for a place to get started, check out the list of [Github Issues](https://github.com/khoj-ai/khoj/issues) with the tag `good first issue` to find issues that are good for first-time contributors.
## Local Server Installation
### Using Pip
#### 1. Khoj Installation
```mdx-code-block
import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';
```
```mdx-code-block
<Tabs>
<TabItem value="macos" label="MacOS">
```shell
# Get Khoj Code
git clone https://github.com/khoj-ai/khoj && cd khoj
# Create, Activate Virtual Environment
python3 -m venv .venv && source .venv/bin/activate
# For MacOS or zsh users run this
pip install -e '.[dev]'
```
</TabItem>
<TabItem value="win" label="Windows">
```shell
# Get Khoj Code
git clone https://github.com/khoj-ai/khoj && cd khoj
# Create, Activate Virtual Environment
python3 -m venv .venv && .venv\Scripts\activate
# Install Khoj for Development
pip install -e '.[dev]'
```
</TabItem>
<TabItem value="unix" label="Linux">
```shell
# Get Khoj Code
git clone https://github.com/khoj-ai/khoj && cd khoj
# Create, Activate Virtual Environment
python3 -m venv .venv && source .venv/bin/activate
# Install Khoj for Development
pip install -e '.[dev]'
```
</TabItem>
</Tabs>
```
#### 2. Postgres Installation & Setup
Khoj uses the `pgvector` package to store embeddings of your index in a Postgres database. To use this, you need to have Postgres installed.
```mdx-code-block
<Tabs groupId="operating-systems">
<TabItem value="macos" label="MacOS">
Install [Postgres.app](https://postgresapp.com/). This comes pre-installed with `pgvector` and relevant dependencies.
</TabItem>
<TabItem value="win" label="Windows">
1. Use the [recommended installer](https://www.postgresql.org/download/windows/).
2. Follow instructions to [Install PgVector](https://github.com/pgvector/pgvector#windows) in case you need to manually install it. Windows support is experimental for pgvector currently, so we recommend using Docker. Refer to Windows Installation Notes below if there are errors.
</TabItem>
<TabItem value="unix" label="Linux">
From [official instructions](https://wiki.postgresql.org/wiki/Apt)
</TabItem>
<TabItem value="source" label="From Source">
1. Follow instructions to [Install Postgres](https://www.postgresql.org/download/)
2. Follow instructions to [Install PgVector](https://github.com/pgvector/pgvector#installation) in case you need to manually install it.
</TabItem>
</Tabs>
```
##### Create the Khoj database
Make sure to update your environment variables to match your Postgres configuration if you're using a different name. The default values should work for most people. When prompted for a password, you can use the default password `postgres`, or configure it to your preference. Make sure to set the environment variable `POSTGRES_PASSWORD` to the same value as the password you set here.
```mdx-code-block
<Tabs groupId="operating-systems">
<TabItem value="macos" label="MacOS">
```shell
createdb khoj -U postgres --password
```
</TabItem>
<TabItem value="win" label="Windows">
```shell
createdb -U postgres khoj --password
```
</TabItem>
<TabItem value="unix" label="Linux">
```shell
sudo -u postgres createdb khoj --password
```
</TabItem>
</Tabs>
```
#### 3. Run
1. Start Khoj
```bash
khoj -vv
```
2. Configure Khoj
- **Via the Desktop application**: Add files, directories to index using the settings page of your desktop application. Click "Save" to immediately trigger indexing.
Note: Wait after configuration for khoj to Load ML model, generate embeddings and expose API to query notes, images, documents etc specified in config YAML
#### Windows Installation Notes
1. Command `khoj` Not Recognized
- Try reactivating the virtual environment and rerunning the `khoj` command.
- If it still doesn't work repeat the installation process.
2. Python Package Missing
- Use `pip install xxx` and try running the `khoj` command.
3. Command `createdb` Not Recognized
- make sure path to postgres binaries is included in environment variables. It usually looks something like
```
C:\Program Files\PostgreSQL\16\bin
```
4. Connection Refused on port xxxx
- Locate the `pg_hba.conf` file in the location where postgres was installed.
- Edit the file to have **trust** as the method for user postgres, local, and host connections.
- Below is an example:
```
host all postgres 127.0.0.1/32 trust
# "local" is for Unix domain socket connections only
local all all trust
# IPv4 local connections:
host all all 127.0.0.1/32 trust
# IPv6 local connections:
host all all ::1/128 trust
```
4. Errors with installing pgvector
- Reinstall Visual Studio 2022 Build Tools with:
1. desktop development with c++ selected in workloads
2. MSVC (C++ Build Tools), Windows 10/11 SDK, and C++/CLI support for build tools selected in individual components.
- Open the x64 Native Tools Command Prompt as an Administrator
- Follow the pgvector windows installation [instructions](https://github.com/pgvector/pgvector?tab=readme-ov-file#windows) in this command prompt.
### Using Docker
Make sure you install the latest version of [Docker](https://docs.docker.com/get-docker/) and [Docker Compose](https://docs.docker.com/compose/install/).
#### 1. Clone
```shell
git clone https://github.com/khoj-ai/khoj && cd khoj
```
#### 2. Configure
1. Update [docker-compose.yml](https://github.com/khoj-ai/khoj/blob/master/docker-compose.yml) to use relevant environment variables.
2. Comment out the `image` line and uncomment the `build` line in the `server` service
#### 3. Run
This will start the Khoj server, and the database.
```shell
docker-compose up -d
```
#### 4. Upgrade
If you've made changes to the codebase, you'll need to rebuild the Docker image before running the container again.
```shell
docker-compose build --no-cache
```
## Update clients
In whichever clients you're using for testing, you'll need to update the server URL to point to your local server. By default, the local server URL should be `http://127.0.0.1:42110`.
## Validate
### Before Making Changes
1. Install Git Hooks for Validation
```shell
./scripts/dev_setup.sh
```
- This ensures standard code formatting fixes and other checks run automatically on every commit and push
- Note 1: If [pre-commit](https://pre-commit.com/#intro) didn't already get installed, [install it](https://pre-commit.com/#install) via `pip install pre-commit`
- Note 2: To run the pre-commit changes manually, use `pre-commit run --hook-stage manual --all` before creating PR
### Before Creating PR
:::tip[Note]
You should be in an active virtual environment for Khoj in order to run the unit tests and linter.
:::
1. Ensure that you have a [Github Issue](https://github.com/khoj-ai/khoj/issues) that can be linked to the PR. If not, create one. Make sure you've tagged one of the maintainers to the issue. This will ensure that the maintainers are notified of the PR and can review it. It's best discuss the code design on an existing issue or Discord thread before creating a PR. This helps get your PR merged faster.
1. Run unit tests.
```shell
pytest
```
2. Run the linter.
```shell
mypy
```
4. Think about how to add unit tests to verify the functionality you're adding in the PR. If you're not sure how to do this, ask for help in the Github issue or on Discord's `#contributors` channel.
### After Creating PR
1. Automated [validation workflows](https://github.com/khoj-ai/khoj/tree/master/.github/workflows) should run for every PR. Tag one of the maintainers in the PR to trigger it.
## Obsidian Plugin Development
### Plugin development setup
The core code for the Obsidian plugin is under `src/interface/obsidian`. The file `main.ts` is a good place to start.
1. In your CLI, go to the directory `src/interface/obsidian` in the Khoj repository.
2. Run `yarn install` to install the dependencies.
3. Run `yarn dev` to start the development server. This will continually rebuild the plugin as you make changes to the code.
- Your code changes will be outputted to a file called `main.js` in the `obsidian` directory.
### Loading your development plugin in Obsidian
1. Make sure you have the Khoj plugin installed in Obsidian. [See the plugin page](https://publish.obsidian.md/hub/02+-+Community+Expansions/02.05+All+Community+Expansions/Plugins/khoj).
1. Open Obsidian and go to your settings (gear icon in the bottom left corner)
2. Click on 'Community Plugins' in the left panel
3. Next to the 'Installed Plugins' heading, click on the folder icon to open the folder with the plugin's source code.
4. Open the `khoj` folder in the file explorer that opens. You'll see a file called `main.js` in this folder. To test your changes, replace this file with the `main.js` file that was generated by the development server in the previous section.
## Create Khoj Release (Only for Maintainers)
Follow the steps below to [release](https://github.com/debanjum/khoj/releases/) Khoj. This will create a stable release of Khoj on [Pypi](https://pypi.org/project/khoj/), [Melpa](https://stable.melpa.org/#%252Fkhoj) and [Obsidian](https://obsidian.md/plugins?id%253Dkhoj). It will also create desktop apps of Khoj and attach them to the latest release.
1. Create and tag release commit by running the bump_version script. The release commit sets version number in required metadata files.
```shell
./scripts/bump_version.sh -c "<release_version>"
```
2. Push commit and then the tag to trigger the release workflow to create Release with auto generated release notes.
```shell
git push origin master # push release commit to khoj repository
git push origin <release_version> # push release tag to khoj repository
```
3. [Optional] Update the Release Notes to highlight new features, fixes and updates
## Architecture
![](/img/khoj_architecture.png)
## Visualize Codebase
*[Interactive Visualization](https://mango-dune-07a8b7110.1.azurestaticapps.net/?repo=debanjum%2Fkhoj)*
![](/img/khoj_codebase_visualization_0.2.1.png)
## Visualize Khoj Obsidian Plugin Codebase
![](/img/khoj_obsidian_codebase_visualization_0.2.1.png)

View File

@@ -0,0 +1,8 @@
{
"label": "Data Sources",
"position": 5,
"link": {
"type": "generated-index",
"description": "Online data sources for indexing via Khoj"
}
}

View File

@@ -1,14 +1,14 @@
# Setup the Github integration
# Github integration
The Github integration allows you to index as many repositories as you want. It's currently default configured to index Issues, Commits, and all Markdown/Org files in each repository. For large repositories, this takes a fairly long time, but it works well for smaller projects.
# Configure your settings
1. Go to [http://localhost:42110/config](http://localhost:42110/config) and enter in settings for the data sources you want to index. You'll have to specify the file paths.
1. Go to [https://app.khoj.dev/settings](https://app.khoj.dev/settings) and enter in settings for the data sources you want to index. You'll have to specify the file paths.
## Use the Github plugin
1. Generate a [classic PAT (personal access token)](https://docs.github.com/en/authentication/keeping-your-account-and-data-secure/managing-your-personal-access-tokens) from [Github](https://github.com/settings/tokens) with `repo` and `admin:org` scopes at least.
2. Navigate to [http://localhost:42110/config/content_type/github](http://localhost:42110/config/content_type/github) to configure your Github settings. Enter in your PAT, along with details for each repository you want to index.
2. Navigate to [https://app.khoj.dev/settings#github](https://app.khoj.dev/settings#github) to configure your Github settings. Enter in your PAT, along with details for each repository you want to index.
3. Click `Save`. Go back to the settings page and click `Configure`.
4. Go to [http://localhost:42110/](http://localhost:42110/) and start searching!
4. Go to [https://app.khoj.dev/](https://app.khoj.dev/) and start searching!

View File

@@ -0,0 +1,19 @@
# Notion Integration
The Notion integration allows you to search/chat with your Notion workspaces. [Notion](https://notion.so/) is a platform people use for taking notes, especially for collaboration.
Go to https://app.khoj.dev/settings to connect your Notion workspace(s) to Khoj.
![notion_integration](https://assets.khoj.dev/notion_integration.gif)
## Self-Hosted Setup
1. Go to https://www.notion.so/my-integrations and create a new integration called Khoj to get an API key.
![setup_new_integration](https://github.com/khoj-ai/khoj/assets/65192171/b056e057-d4dc-47dc-aad3-57b59a22c68b)
3. Share all the workspaces that you want to integrate with the Khoj integration you just made in the previous step
![enable_workspace](https://github.com/khoj-ai/khoj/assets/65192171/98290303-b5b8-4cb0-b32c-f68c6923a3d0)
4. In the first step, you generated an API key. Use the newly generated API Key in your Khoj settings, by default at [http://localhost:42110/settings#notion](http://localhost:42110/settings#notion). Click `Save`.
5. Click `Configure` in http://localhost:42110/settings to index your Notion workspace(s).
That's it! You should be ready to start searching and chatting. Make sure you've configured your [chat settings](/get-started/setup#2-configure).

View File

@@ -0,0 +1,16 @@
---
sidebar_position: 0
keywords: ["upload data", "upload files", "share data", "share files", "pdf ai", "ai for pdf", "ai for documents", "ai for files", "local ai pdf", "local ai documents", "local ai files"]
---
# Upload your data
There are several ways you can get started with sharing your data with the Khoj AI.
- Drag and drop your documents via [the web UI](/clients/web/#upload-documents). This is best if you have a one-off document you need to interact with.
- Use the desktop app to [upload and sync your documents](/clients/desktop). This is best if you have a lot of documents on your computer or you need the docs to stay in sync.
- Setup the sync options for either [Obsidian](/clients/obsidian) or [Emacs](/clients/emacs) to automatically sync your documents with Khoj. This is best if you are already using these tools and want to leverage Khoj's AI capabilities.
- Configure your [Notion](/data-sources/notion_integration) or [Github](/data-sources/github_integration) to sync with Khoj. By providing your credentials, you can keep the data synced in the background.
![demo of dragging and dropping a file](https://assets.khoj.dev/drag_drop_file.gif)

View File

@@ -0,0 +1,8 @@
{
"label": "Features",
"position": 3,
"link": {
"type": "generated-index",
"description": "Features supported by Khoj"
}
}

View File

@@ -0,0 +1,15 @@
---
sidebar_position: 4
---
# Agents
You can use agents to setup custom system prompts with Khoj. The server host can setup their own agents, which are accessible to all users. You can see ours at https://app.khoj.dev/agents.
![Demo](https://assets.khoj.dev/agents_demo.gif)
## Creating an Agent (Self-Hosted)
Go to `server/admin/database/agent` on your server and click `Add Agent` to create a new one. You have to set it to `public` in order for it to be accessible to all the users on your server. To limit access to a specific user, do not set the `public` flag and add the user in the `Creator` field.
Set your custom prompt in the `personality` field.

View File

@@ -0,0 +1,35 @@
---
sidebar_position: 1
---
# Overview
Khoj supports a variety of features, including search and chat with a wide range of data sources and interfaces.
#### [Search](/features/search)
- **Local**: Your personal data stays local. All search and indexing is done on your machine when you [self-host](/get-started/setup)
- **Incremental**: Incremental search for a fast, search-as-you-type experience
#### [Chat](/features/chat)
- **Faster answers**: Find answers faster, smoother than search. No need to manually scan through your notes to find answers.
- **Iterative discovery**: Iteratively explore and (re-)discover your notes
- **Assisted creativity**: Smoothly weave across answers retrieval and content generation
- **Works online or offline**: Chat using online or offline AI chat models
#### General
- **Cloud or Self-Host**: Use [cloud](https://app.khoj.dev/login) to use Khoj anytime from anywhere or [self-host](/get-started/setup) for privacy
- **Natural**: Advanced natural language understanding using Transformer based ML Models
- **Pluggable**: Modular architecture makes it easy to plug in new data sources, frontends and ML models
- **Multiple Sources**: Index your Org-mode, Markdown, PDF, plaintext files, Github repos and Notion pages
- **Multiple Interfaces**: Interact from your Web Browser, Emacs, Obsidian, Desktop app or even Whatsapp
### Supported Interfaces
Khoj is available as a [Desktop app](/clients/desktop), [Emacs package](/clients/emacs), [Obsidian plugin](/clients/obsidian), [Web app](/clients/web) and a [Whatsapp AI](https://khoj.dev/whatsapp).
![](/img/khoj_clients.svg ':size=400px')
### Supported Data Sources
Khoj can understand your word, PDF, org-mode, markdown, plaintext files, [Github projects](/data-sources/github_integration) and [Notion pages](/data-sources/notion_integration).
![](/img/khoj_datasources.svg ':size=200px')

View File

@@ -0,0 +1,9 @@
# Automations
[Automations](https://app.khoj.dev/automations) are a powerful feature within Khoj to schedule repeated tasks for information retrieval directly from your account. You can run them at a specific time and interval. This is still an experimental feature, so please report any issues you encounter.
Khoj will use your local time zone to determine the scheduling localization. You can go back and configure the prompt any time you want from the automations page. You can also delete the automation if you no longer need it.
:::danger[Note]
Automations will not deliver emails to self-hosted users out of the box. You'll have to have Resend and [Authentication](/advanced/authentication) setup to send emails.
:::

View File

@@ -0,0 +1,71 @@
---
sidebar_position: 2
---
# Chat
You can configure Khoj to chat with you about anything. When relevant, it'll use any notes or documents you shared with it to respond. It acts as an excellent research assistant, search engine, or personal tutor.
<img src="/img/khoj_chat_on_web.png" alt="Chat on Web" style={{width: '400px'}}/>
### Overview
- Creates a personal assistant for you to inquire and engage with your notes or online information as needed
- You can choose to use Online or Offline Chat depending on your requirements
- Supports multi-turn conversations with the relevant notes for context
- Shows reference notes used to generate a response
### Setup (Self-Hosting)
#### Offline Chat
Offline chat stays completely private and can work without internet using open-source models.
> **System Requirements**:
> - Minimum 8 GB RAM. Recommend **16Gb VRAM**
> - Minimum **5 GB of Disk** available
> - A CPU supporting [AVX or AVX2 instructions](https://en.wikipedia.org/wiki/Advanced_Vector_Extensions) is required
> - An Nvidia, AMD GPU or a Mac M1+ machine would significantly speed up chat response times
1. Open your [Khoj offline settings](http://localhost:42110/server/admin/database/offlinechatprocessorconversationconfig/) and click *Enable* on the Offline Chat configuration.
2. Open your [Chat model options settings](http://localhost:42110/server/admin/database/chatmodeloptions/) and add any [GGUF chat model](https://huggingface.co/models?library=gguf) to use for offline chat. Make sure to use `Offline` as its type. For a balanced chat model that runs well on standard consumer hardware we recommend using [Llama 3.1 by Meta](https://huggingface.co/bartowski/Meta-Llama-3.1-8B-Instruct-GGUF) by default. For machines with no or small GPU we recommend using [Gemma 2 2B](https://huggingface.co/bartowski/gemma-2-2b-it-GGUF) or [Phi 3.5 mini](https://huggingface.co/bartowski/Phi-3.5-mini-instruct-GGUF)
:::tip[Note]
Offline chat is not supported for a multi-user scenario. The host machine will encounter segmentation faults if multiple users try to use offline chat at the same time.
:::
#### Online Chat
Online chat requires internet to use ChatGPT but is faster, higher quality and less compute intensive.
:::danger[Warning]
This will enable Khoj to send your chat queries and query relevant notes to OpenAI for processing.
:::
1. Get your [OpenAI API Key](https://platform.openai.com/account/api-keys)
2. Open your [Khoj Online Chat settings](http://localhost:42110/server/admin/database/openaiprocessorconversationconfig/). Add a new setting with your OpenAI API key, and click *Save*. Only one configuration will be used, so make sure that's the only one you have.
3. Open your [Chat model options](http://localhost:42110/server/admin/database/chatmodeloptions/) and add a new option for the OpenAI chat model you want to use. Make sure to use `OpenAI` as its type.
### Use
1. Open Khoj Chat
- **On Web**: Open [/chat](https://app.khoj.dev/chat) in your web browser
- **On Obsidian**: Search for *Khoj: Chat* in the [Command Palette](https://help.obsidian.md/Plugins/Command+palette)
- **On Emacs**: Run `M-x khoj <user-query>`
2. Enter your queries to chat with Khoj. Use [slash commands](#commands) and [query filters](/miscellaneous/advanced#query-filters) to change what Khoj uses to respond
#### Details
1. Your query is used to retrieve the most relevant notes, if any, using Khoj search
2. These notes, the last few messages and associated metadata is passed to the enabled chat model along with your query to generate a response
#### Conversation File Filters
You can use conversation file filters to limit the notes used in the chat response. To do so, use the left panel in the web UI. Alternatively, you can also use [query filters](/miscellaneous/advanced#query-filters) to limit the notes used in the chat response.
<img src="/img/select_file_filter.png" alt="Conversation File Filter" style={{width: '400px'}}/>
#### Commands
Slash commands allows you to change what Khoj uses to respond to your query
- **/notes**: Limit chat to only respond using your notes, not just Khoj's general world knowledge as reference
- **/general**: Limit chat to only respond using Khoj's general world knowledge, not using your notes as reference
- **/default**: Allow chat to respond using your notes or it's general knowledge as reference. It's the default behavior when no slash command is used
- **/online**: Use online information and incorporate it in the prompt to the LLM to send you a response.
- **/image**: Generate an image in response to your query.
- **/help**: Use /help to get all available commands and general information about Khoj
- **/summarize**: Can be used to summarize 1 selected file filter for that conversation. Refer to [File Summarization](summarization) for details.

View File

@@ -0,0 +1,15 @@
# Image Generation
You can use Khoj to generate images from text prompts. You can get deeper into the details of our image generation flow in this blog post: https://blog.khoj.dev/posts/how-khoj-generates-images/.
To generate images, you just need to provide a prompt to Khoj in which the image generation is in the instructions. Khoj will automatically detect the image generation intent, augment your generation prompt, and then create the image. Here are some examples:
| Prompt | Image |
| --- | --- |
| Paint a picture of the plants I got last month, pixar-animation | ![plants](/img/plants_i_got.png) |
| Create a picture of my dream house, based on my interests | ![house](/img/dream_house.png) |
## Setup (Self-Hosting)
Right now, we only support integration with OpenAI's DALL-E. You need to have an OpenAI API key to use this feature. Here's how you can set it up:
1. Setup your OpenAI API key. See instructions [here](/get-started/setup#2-configure)
2. Create a text to image config at http://localhost:42110/server/admin/database/texttoimagemodelconfig/. We recommend the value `dall-e-3`.

View File

@@ -0,0 +1,18 @@
# Keyboard Shortcuts
Oftentimes, having to leave your keyboard to move the mouse can break your flow. We want to make it as easy as possible for AI to flow with you as you are, so we've added some keyboard shortcuts to facilitate that.
## Web App
### Up/Down Arrow Keys (Chat Input)
- **Up Arrow Key**: Move up in the list of most recent messages in your chat window.
- **Down Arrow Key**: Move down in the list of most recent messages in your chat window.
You can watch the demo to see how it works on [this sample conversation](http://app.khoj.dev/share/chat/in-particular-assess-the-prospect-for-brazil-/).
<img src="https://assets.khoj.dev/up_down_shortcuts.gif" height="300" alt="Up/Down Arrow Keys"></img>
### Enter (Chat Input)
Press 'Enter' to send the message you've typed in the chat window.

View File

@@ -0,0 +1,9 @@
# Desktop Quick Chat (Khoj Mini)
Once you have the Khoj [desktop application](https://khoj.dev/downloads) installed, you can use the desktop shortcut to quickly pull up a mini chat module for quicker answers. See the desktop setup instructions [in the docs](/clients/desktop.md) for more information.
To use it, you just have to copy the text you want to inject into your query, and then run `Ctrl + Shift + K` (or `Cmd + Shift + K` on Mac) to open the mini chat module. The text you copied will be automatically pasted into the chat module, and you can then hit enter to get the answer. You can edit the text before hitting enter if you want to refine your query.
The desktop shortcut is a great way to quickly get answers to your questions without having to switch between windows or tabs. It's especially useful when you're working on a project and need to quickly look up something without losing your focus.
![Desktop Shortcut](https://assets.khoj.dev/courseload_decision_dekstop.gif)

Some files were not shown because too many files have changed in this diff Show More