release: v0.1.5 web_fetch direct links

ilya-bov
2026-03-23 15:25:52 +03:00
parent 61a43dc45a
commit 5d3b1218ab
14 changed files with 308 additions and 10 deletions


@@ -2,6 +2,21 @@
All notable changes to this project will be documented in this file.
## [0.1.5] - 2026-03-23
### Added
- New `web_fetch` tool for direct URL reading and page extraction.
- New prompt guide `tool-web_fetch.md` for link-specific workflows.
### Changed
- `search_web` is now explicitly discovery-oriented; direct links should use `web_fetch`.
- Chat tool output UI now shows `Web Fetch` calls with the target URL.
- Request lifecycle docs updated with `web_fetch` in tool catalog.
### Fixed
- Direct link requests no longer degrade into generic search queries.
- Health endpoint version updated to `0.1.5`.
## [0.1.4] - 2026-03-23
### Added


@@ -22,8 +22,8 @@ The app runs as a Next.js service and stores runtime state on disk (`./data`).
## Releases
- Latest release snapshot: [0.1.4 - Web Search Autostart](./docs/releases/0.1.4-web-search-autostart.md)
- GitHub release body: [v0.1.4](./docs/releases/github-v0.1.4.md)
- Latest release snapshot: [0.1.5 - Web Fetch for Direct Links](./docs/releases/0.1.5-web-fetch-direct-links.md)
- GitHub release body: [v0.1.5](./docs/releases/github-v0.1.5.md)
- Release archive: [docs/releases/README.md](./docs/releases/README.md)
## Contributing and Support


@@ -0,0 +1,42 @@
# Eggent 0.1.5 - Web Fetch for Direct Links
Date: 2026-03-23
Type: Patch release snapshot
## Release Name
`Web Fetch for Direct Links`
This release adds a dedicated `web_fetch` tool so agents can open and read a specific URL directly, instead of treating links as generic search queries.
## What Is Included
### 1) New `web_fetch` Tool
- Added a dedicated tool for fetching direct `http(s)` URLs.
- Supports redirected URLs and returns readable page content.
- Extracts readable text from HTML pages and also handles JSON and plain-text responses.
### 2) URL Fetch Reliability Guards
- Added URL normalization and validation for direct-link requests.
- Added timeout and response-size limits to keep tool execution stable.
- Added content trimming for large pages to keep results manageable for the model.
### 3) Tooling and Prompt Separation
- `search_web` remains discovery-focused for broad web lookup.
- Added explicit prompt guidance so direct links use `web_fetch`.
- Added `tool-web_fetch.md` with usage rules for link-based tasks.
### 4) Chat UI and Docs Updates
- Tool output panel now renders `Web Fetch` with target URL context.
- Request lifecycle docs now include `web_fetch` in the tool catalog.
## New in 0.1.5
- Direct link reading with a first-class `web_fetch` tool.
- Cleaner separation: `search_web` for search, `web_fetch` for specific pages.
- Package/app health version bumped to `0.1.5`.
## Upgrade Notes
- No migration is required.
- Existing workflows continue to work.
- For link-specific tasks, use `web_fetch` instead of `search_web`.
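
The split described in these notes can be pictured as a naive routing rule — a hypothetical sketch for illustration, not shipped code: inputs that look like URLs go to `web_fetch`, everything else to `search_web`.

```typescript
// Hypothetical classifier illustrating the search_web / web_fetch split.
// The real agent decides via prompt guidance, not a function like this.
function pickTool(input: string): "web_fetch" | "search_web" {
  const looksLikeUrl =
    /^https?:\/\//i.test(input) ||
    /^(www\.)?[a-z0-9.-]+\.[a-z]{2,}(?:[/:?#]|$)/i.test(input);
  return looksLikeUrl ? "web_fetch" : "search_web";
}

console.log(pickTool("https://example.com/article")); // web_fetch
console.log(pickTool("latest eggent release notes")); // search_web
```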


@@ -4,6 +4,7 @@ This directory contains release summaries and publish-ready notes.
| Version | Name | Date | Notes |
| --- | --- | --- | --- |
| `0.1.5` | Web Fetch for Direct Links | 2026-03-23 | [Full snapshot](./0.1.5-web-fetch-direct-links.md), [GitHub body](./github-v0.1.5.md) |
| `0.1.4` | Web Search Autostart | 2026-03-23 | [Full snapshot](./0.1.4-web-search-autostart.md), [GitHub body](./github-v0.1.4.md) |
| `0.1.3` | OAuth Native CLI Providers | 2026-03-06 | [Full snapshot](./0.1.3-oauth-native-cli-providers.md), [GitHub body](./github-v0.1.3.md) |
| `0.1.2` | Dark Theme and Python Recovery | 2026-03-06 | [Full snapshot](./0.1.2-dark-theme-python-recovery.md), [GitHub body](./github-v0.1.2.md) |


@@ -0,0 +1,23 @@
## Eggent v0.1.5 - Web Fetch for Direct Links
Patch release focused on direct-link handling via a dedicated web fetch tool.
### Highlights
- Added new `web_fetch` tool for opening and reading specific URLs.
- Added HTML-to-text extraction, JSON/text handling, timeout, and response-size limits for stable fetch behavior.
- Kept `search_web` focused on discovery; direct links now use `web_fetch`.
- Updated chat tool output UI with `Web Fetch` label and target URL preview.
- Updated request-flow documentation and tool prompts for the new split.
- Version bump to `0.1.5` across package metadata and `GET /api/health`.
### Upgrade Notes
- No migration required.
- Existing search behavior is preserved.
- For URL-specific tasks, call `web_fetch` directly.
### Links
- Full release snapshot: `docs/releases/0.1.5-web-fetch-direct-links.md`
- Installation and update guide: `README.md`


@@ -46,6 +46,7 @@ A tool set is created depending on context and settings:
| `memory_delete` | If memory is enabled | Delete memory records |
| `knowledge_query` | Always | Search knowledge base documents |
| `search_web` | If web search is enabled | Search the internet |
| `web_fetch` | If web tools are enabled | Fetch a specific URL |
| `load_skill` | If `projectId` exists | Load full skill instructions |
| `call_subordinate` | For agents 0-2 only | Delegate to a subordinate agent |
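
The conditions in the table can be sketched as a simple selection function — an assumed illustration of how `createAgentTools` gates the tool set; names mirror the catalog, but the real factory takes full `context` and `settings` objects.

```typescript
// Illustrative sketch of conditional tool-set assembly (not the real factory).
interface ToolContext {
  memoryEnabled: boolean;
  webSearchEnabled: boolean;
  projectId?: string;
  agentNumber: number;
}

function selectToolNames(ctx: ToolContext): string[] {
  // response, code_execution, and knowledge_query are always available.
  const tools = ["response", "code_execution", "knowledge_query"];
  if (ctx.memoryEnabled) tools.push("memory_save", "memory_load", "memory_delete");
  if (ctx.webSearchEnabled) tools.push("search_web", "web_fetch");
  if (ctx.projectId) tools.push("load_skill");
  if (ctx.agentNumber <= 2) tools.push("call_subordinate");
  return tools;
}
```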
@@ -159,7 +160,7 @@ When **`load_skill`** is called, the tool reads the selected skill's full **SKIL
[agent.ts] runAgent:
1. getSettings() -> model, settings
2. getChat(chatId) -> context.history
3. createAgentTools(context, settings) -> tools (response, code_execution, memory_*, knowledge_query, search_web?, load_skill?, call_subordinate?)
3. createAgentTools(context, settings) -> tools (response, code_execution, memory_*, knowledge_query, search_web?, web_fetch?, load_skill?, call_subordinate?)
4. buildSystemPrompt(projectId, agentNumber, toolNames) ->
system.md + Agent Identity + tool-*.md per tool + Active Project + project.instructions + loadProjectSkillsMetadata -> <available_skills> + date/time
5. messages = history + { user, userMessage }

package-lock.json generated

@@ -1,12 +1,12 @@
{
"name": "design-vibe",
"version": "0.1.4",
"version": "0.1.5",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"name": "design-vibe",
"version": "0.1.4",
"version": "0.1.5",
"dependencies": {
"@ai-sdk/anthropic": "^3.0.37",
"@ai-sdk/google": "^3.0.21",


@@ -1,6 +1,6 @@
{
"name": "design-vibe",
"version": "0.1.4",
"version": "0.1.5",
"private": true,
"scripts": {
"dev": "next dev",


@@ -2,6 +2,6 @@ export async function GET() {
return Response.json({
status: "ok",
timestamp: new Date().toISOString(),
version: "0.1.4",
version: "0.1.5",
});
}


@@ -7,6 +7,7 @@ import {
Terminal,
Brain,
Search,
Globe,
FileText,
Bot,
Puzzle,
@@ -27,6 +28,7 @@ const TOOL_ICONS: Record<string, React.ElementType> = {
memory_load: Brain,
memory_delete: Brain,
search_web: Search,
web_fetch: Globe,
knowledge_query: FileText,
call_subordinate: Bot,
load_skill: Puzzle,
@@ -51,6 +53,7 @@ const TOOL_LABELS: Record<string, string> = {
memory_load: "Memory Load",
memory_delete: "Memory Delete",
search_web: "Web Search",
web_fetch: "Web Fetch",
knowledge_query: "Knowledge Query",
call_subordinate: "Subordinate Agent",
load_skill: "Load Skill",
@@ -101,6 +104,11 @@ export function ToolOutput({ toolName, args, result }: ToolOutputProps) {
&quot;{String(args.query)}&quot;
</span>
) : null}
{toolName === "web_fetch" && args.url ? (
<span className="text-xs text-muted-foreground truncate">
{String(args.url)}
</span>
) : null}
</button>
{expanded && (


@@ -9,6 +9,69 @@ interface SearchResult {
const MAX_RESULTS = 10;
const DDG_HTML_ENDPOINT = "https://html.duckduckgo.com/html";
const DDG_INSTANT_ENDPOINT = "https://api.duckduckgo.com/";
const WEB_FETCH_TIMEOUT_MS = 20000;
const WEB_FETCH_MAX_BYTES = 1_500_000;
const WEB_FETCH_MAX_CHARS = 12000;
export async function fetchWebPage(rawUrl: string): Promise<string> {
const url = normalizeFetchUrl(rawUrl);
const abortController = new AbortController();
const timeout = setTimeout(() => abortController.abort(), WEB_FETCH_TIMEOUT_MS);
try {
const response = await fetch(url.toString(), {
method: "GET",
redirect: "follow",
signal: abortController.signal,
headers: {
Accept:
"text/html,application/xhtml+xml,application/xml;q=0.9,text/plain;q=0.8,application/json;q=0.7,*/*;q=0.5",
"User-Agent":
"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36",
},
});
if (!response.ok) {
throw new Error(`HTTP ${response.status} ${response.statusText}`);
}
const contentType = (response.headers.get("content-type") || "").toLowerCase();
const finalUrl = response.url || url.toString();
const rawBody = await readResponseBodyLimited(response, WEB_FETCH_MAX_BYTES);
const parsed = parseFetchedBody(rawBody, contentType);
const content = parsed.content.trim();
const trimmed = content.slice(0, WEB_FETCH_MAX_CHARS);
const wasTrimmed = content.length > WEB_FETCH_MAX_CHARS;
if (!trimmed) {
return `Fetched URL: ${finalUrl}\nContent-Type: ${contentType || "unknown"}\nNo readable text content found.`;
}
const lines: string[] = [
`Fetched URL: ${finalUrl}`,
`Content-Type: ${contentType || "unknown"}`,
];
if (parsed.title) {
lines.push(`Title: ${parsed.title}`);
}
lines.push("");
lines.push(trimmed);
if (wasTrimmed) {
lines.push("");
lines.push(`[truncated to ${WEB_FETCH_MAX_CHARS} chars]`);
}
return lines.join("\n");
} catch (error) {
if (error instanceof Error && error.name === "AbortError") {
return `Web fetch error: timed out after ${Math.round(WEB_FETCH_TIMEOUT_MS / 1000)} seconds`;
}
return `Web fetch error: ${error instanceof Error ? error.message : String(error)}`;
} finally {
clearTimeout(timeout);
}
}
/**
* Search the web using configured provider
@@ -168,6 +231,119 @@ function stripHtml(text: string): string {
.trim();
}
function normalizeFetchUrl(raw: string): URL {
const input = raw.trim();
if (!input) {
throw new Error("URL is required.");
}
let normalized = input;
if (!/^[a-z][a-z\d+\-.]*:\/\//i.test(normalized)) {
if (/^(www\.)/i.test(normalized) || /^[a-z0-9.-]+\.[a-z]{2,}(?:[/:?#]|$)/i.test(normalized)) {
normalized = `https://${normalized}`;
} else {
throw new Error("Invalid URL. Expected an absolute http(s) URL.");
}
}
let url: URL;
try {
url = new URL(normalized);
} catch {
throw new Error("Invalid URL format.");
}
if (url.protocol !== "http:" && url.protocol !== "https:") {
throw new Error("Only http(s) URLs are supported.");
}
return url;
}
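
For illustration, the normalization rules above behave like this condensed sketch (same regexes, simplified error handling; `normalize` is a stand-in name, not an export):

```typescript
// Condensed restatement of the normalization logic in normalizeFetchUrl.
function normalize(raw: string): string {
  let input = raw.trim();
  if (!/^[a-z][a-z\d+\-.]*:\/\//i.test(input)) {
    // Bare hostnames and www-prefixed inputs get an https:// scheme.
    if (/^(www\.)/i.test(input) || /^[a-z0-9.-]+\.[a-z]{2,}(?:[/:?#]|$)/i.test(input)) {
      input = `https://${input}`;
    } else {
      throw new Error("Invalid URL. Expected an absolute http(s) URL.");
    }
  }
  const url = new URL(input);
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    throw new Error("Only http(s) URLs are supported.");
  }
  return url.toString();
}

console.log(normalize("www.example.com"));   // https://www.example.com/
console.log(normalize("example.com/a?b=1")); // https://example.com/a?b=1
```

Non-http schemes such as `ftp://` parse as URLs but are rejected by the protocol check.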
async function readResponseBodyLimited(response: Response, maxBytes: number): Promise<string> {
const reader = response.body?.getReader();
if (!reader) {
return await response.text();
}
const decoder = new TextDecoder();
let total = 0;
let text = "";
while (true) {
const { done, value } = await reader.read();
if (done) break;
if (!value) continue;
total += value.byteLength;
if (total > maxBytes) {
await reader.cancel();
throw new Error(`Response too large. Limit: ${maxBytes} bytes.`);
}
text += decoder.decode(value, { stream: true });
}
text += decoder.decode();
return text;
}
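
The byte cap above can be exercised without a live fetch by feeding the same loop an in-memory stream — a minimal sketch, assuming a runtime (Node 18+) where `ReadableStream`, `TextEncoder`, and `TextDecoder` are globals:

```typescript
// Same decode loop as readResponseBodyLimited, written against a bare stream.
async function readLimited(
  stream: ReadableStream<Uint8Array>,
  maxBytes: number
): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let total = 0;
  let text = "";
  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    if (!value) continue;
    total += value.byteLength;
    if (total > maxBytes) {
      // Stop pulling bytes as soon as the cap is exceeded.
      await reader.cancel();
      throw new Error(`Response too large. Limit: ${maxBytes} bytes.`);
    }
    // stream: true keeps multi-byte characters intact across chunk boundaries.
    text += decoder.decode(value, { stream: true });
  }
  return text + decoder.decode();
}

// Helper to build an in-memory stream from string chunks.
function streamOf(...chunks: string[]): ReadableStream<Uint8Array> {
  const encoder = new TextEncoder();
  return new ReadableStream<Uint8Array>({
    start(controller) {
      for (const chunk of chunks) controller.enqueue(encoder.encode(chunk));
      controller.close();
    },
  });
}
```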
function parseFetchedBody(
body: string,
contentType: string
): { title?: string; content: string } {
if (contentType.includes("application/json")) {
try {
const parsed = JSON.parse(body) as unknown;
return { content: JSON.stringify(parsed, null, 2) };
} catch {
return { content: body };
}
}
if (contentType.includes("text/html") || looksLikeHtml(body)) {
const titleMatch = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(body);
const title = titleMatch ? normalizeFetchedText(stripHtml(decodeHtmlEntities(titleMatch[1]))) : "";
return {
title: title || undefined,
content: htmlToText(body),
};
}
return { content: normalizeFetchedText(body) };
}
function looksLikeHtml(body: string): boolean {
const sample = body.slice(0, 1000).toLowerCase();
return sample.includes("<html") || sample.includes("<body") || sample.includes("<!doctype html");
}
function htmlToText(html: string): string {
const cleaned = html
.replace(/<!--[\s\S]*?-->/g, " ")
.replace(/<script[\s\S]*?<\/script>/gi, " ")
.replace(/<style[\s\S]*?<\/style>/gi, " ")
.replace(/<noscript[\s\S]*?<\/noscript>/gi, " ")
.replace(/<svg[\s\S]*?<\/svg>/gi, " ")
.replace(/<template[\s\S]*?<\/template>/gi, " ");
const withBreaks = cleaned.replace(
/<\/?(h[1-6]|p|div|section|article|header|footer|main|aside|nav|li|ul|ol|table|tr|td|th|blockquote|pre|br)[^>]*>/gi,
"\n"
);
return normalizeFetchedText(decodeHtmlEntities(stripHtml(withBreaks)));
}
function normalizeFetchedText(text: string): string {
return text
.replace(/\r/g, "")
.replace(/[ \t]+\n/g, "\n")
.replace(/\n{3,}/g, "\n\n")
.replace(/[ \t]{2,}/g, " ")
.trim();
}
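
End to end, the extraction pipeline above behaves like this simplified sketch, with stand-ins for `stripHtml` and `decodeHtmlEntities` (the real helpers live elsewhere in `search-engine.ts` and handle more entities):

```typescript
// Simplified stand-in for stripHtml: replace tags with spaces.
function stripTags(text: string): string {
  return text.replace(/<[^>]+>/g, " ");
}

// Simplified stand-in for decodeHtmlEntities: a handful of common entities.
function decodeEntities(text: string): string {
  return text
    .replace(/&amp;/g, "&")
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&#39;/g, "'")
    .replace(/&nbsp;/g, " ");
}

// Condensed htmlToText: drop non-content tags, turn block tags into
// newlines, strip the rest, decode entities, normalize whitespace.
function toText(html: string): string {
  const cleaned = html
    .replace(/<script[\s\S]*?<\/script>/gi, " ")
    .replace(/<style[\s\S]*?<\/style>/gi, " ");
  const withBreaks = cleaned.replace(/<\/?(h[1-6]|p|div|li|br)[^>]*>/gi, "\n");
  return decodeEntities(stripTags(withBreaks))
    .replace(/[ \t]+\n/g, "\n")
    .replace(/\n{3,}/g, "\n\n")
    .replace(/[ \t]{2,}/g, " ")
    .trim();
}

console.log(toText("<p>Hello &amp; welcome</p>")); // Hello & welcome
```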
function decodeDuckDuckGoUrl(rawUrl: string): string {
try {
const parsed = new URL(


@@ -17,7 +17,7 @@ import {
} from "@/lib/tools/code-execution";
import { memorySave, memoryLoad, memoryDelete } from "@/lib/tools/memory-tools";
import { knowledgeQuery } from "@/lib/tools/knowledge-query";
import { searchWeb } from "@/lib/tools/search-engine";
import { fetchWebPage, searchWeb } from "@/lib/tools/search-engine";
import { callSubordinate } from "@/lib/tools/call-subordinate";
import { createCronTool } from "@/lib/tools/cron-tool";
import { installPackages } from "@/lib/tools/install-orchestrator";
@@ -1271,11 +1271,11 @@ export function createAgentTools(
if (settings.search.enabled && settings.search.provider !== "none") {
tools.search_web = tool({
description:
"Search the internet for current information. Use this when you need up-to-date information, facts you're unsure about, or any web-based research.",
"Search the internet for current information. Use this for broad discovery and multiple sources. For a specific URL, use web_fetch.",
inputSchema: z.object({
query: z
.string()
.describe("The search query"),
.describe("The search query (not a direct URL)"),
limit: z
.number()
.default(5)
@@ -1287,6 +1287,21 @@ export function createAgentTools(
});
}
if (settings.search.enabled) {
tools.web_fetch = tool({
description:
"Fetch and read content from a specific web page URL. Use this when the user gives a direct link.",
inputSchema: z.object({
url: z
.string()
.describe("Absolute http(s) URL to fetch, for example https://example.com/article"),
}),
execute: async ({ url }) => {
return fetchWebPage(url);
},
});
}
const telegramRuntime = getTelegramRuntimeData(context);
if (telegramRuntime) {
tools.telegram_send_file = tool({


@@ -13,6 +13,7 @@ Search the internet for current information.
## Best Practices
- Use specific, targeted search queries
- Do not pass raw URLs here; use `web_fetch` for direct link reading
- For technical queries, include technology names and versions
- Review multiple results before drawing conclusions
- Cite sources when presenting information from search results


@@ -0,0 +1,16 @@
# Web Fetch Tool
Fetch a specific web page by URL and return readable page content.
## When to Use
- The user provides a direct link and asks to read/summarize it
- You need content from one known page, not broad discovery
- You must verify details from a specific source URL
## Best Practices
- Pass a full `http(s)` URL
- Prefer `web_fetch` for direct links, `search_web` for discovery
- If fetch fails, explain the error and ask for another link if needed
- Quote or summarize only the relevant sections in your final response