From 83f0cc379df6ebe5143df6ec1402cd38df3bb639 Mon Sep 17 00:00:00 2001
From: larchanka
Date: Mon, 23 Mar 2026 18:27:27 +0100
Subject: [PATCH] Update research skill

---
 skills/research/SKILL.md | 283 ++++++++++++++++++++++++++++++++-------
 1 file changed, 231 insertions(+), 52 deletions(-)

diff --git a/skills/research/SKILL.md b/skills/research/SKILL.md
index 8abf3f1..84a379f 100644
--- a/skills/research/SKILL.md
+++ b/skills/research/SKILL.md
@@ -1,87 +1,266 @@
-Description: PRIMARY SEARCH. Use for deep web research, fact-checking, news gathering, or topical deep dives via the lynx tool. This is the default skill for any query requiring external or up-to-date information.
+Description: PRIMARY SEARCH. Use for deep research, fact-checking, or when external/up-to-date information is required. The agent MUST decide the best tool: use existing tools when sufficient, or escalate to lynx for deep, recursive, multi-step research. The agent is expected to pursue **maximum completeness** and continue researching until the task is fully resolved.
 
 # Research Skill
 
-Deep web research using the text-based browser `lynx` and DuckDuckGo HTML interface.
+Flexible, **depth-first web research system** using:
+- Existing tools (search APIs, structured tools, integrations)
+- OR the text-based browser `lynx` for deep/manual exploration
 
-## When to Use
+---
 
-**USE this skill when:**
+## Core Principle
 
-- You need current information from the web.
-- You need to deep dive into a topic by following multiple links.
-- Fact-checking or verifying news.
-- Researching technical documentation or complex subjects.
+**DO NOT default to lynx.**
+**DO NOT stop early.**
 
-**DON'T use this skill when:**
+You must:
+1. Choose the simplest tool that can work
+2. Escalate if needed
+3. Continue researching until:
+   - The question is fully answered
+   - No major unknowns remain
+   - Additional search provides diminishing returns
 
-- The information is already available in your training data (unless it's time-sensitive).
-- You can solve the task with a simple search and the snippet is enough.
+---
 
-## Strategy
+## Termination Rule (CRITICAL)
 
-Web research is a multi-step process. Do not stop at the first page of search results.
+You MUST continue research in a loop until ALL conditions are met:
 
-1. **Planning**: Start with a plan of what you want to research. Plan must be very detailed and logical. Build list of search queries to execute. They have to be logical and cover different aspects of the topic.
-2. **Search**: Execute search queries one by one using `lynx -dump` command.
-3. **Analyze**: Scan the search results after each query. Identify 2-10 most promising links. If needed go to the next page of search results.
-4. **Browse**: Visit those links one by one.
-5. **Data Analysis**: Analyze the data you have collected. Identify key findings and insights. Find missing information and plan additional search queries. Repeat this step until you have enough information to fulfill the user's request.
-6. **Recurse**: If a page contains a "References" section or promising links to deeper info, follow them.
-7. **Summarize**: Gather all key findings, insights, valuable details and provide a comprehensive response.
+- ✅ The user’s question is fully answered
+- ✅ Key subtopics are covered
+- ✅ Conflicting information (if any) is resolved
+- ✅ You are confident no major gaps remain
 
-**!!! IMPORTANT !!!** If you think that you dont have enough information to answer the user's request, you can generate new search queries and repeat the process.
+If NOT:
+→ Generate new queries and continue
 
-## Commands
+This creates an **effectively unbounded research loop** (within system limits).
 
-### 1. Perform a Search
+---
 
-Use DuckDuckGo's HTML or Lite interface with a standard User-Agent to avoid blocks.
+## Tool Selection Logic (CRITICAL)
+
+### 1. Use EXISTING TOOLS when:
+
+- Answer can be obtained in **1–2 steps**
+- Data is already structured
+- Snippets are sufficient
+
+---
+
+### 2. Use LYNX when:
+
+- You need **multi-step exploration**
+- You must open and analyze pages
+- Information is fragmented
+- You need comparisons or synthesis
+- Existing tools are insufficient
+
+---
+
+### 3. Hybrid Strategy (RECOMMENDED)
+
+1. Start with existing tools → fast overview
+2. Identify gaps
+3. Use lynx → deep dive
+4. Repeat until complete
+
+---
+
+## Deep Research Strategy (MANDATORY LOOP)
+
+### Phase 1 — Planning
+
+- Break the task into subtopics
+- Generate **diverse search queries**
+- Cover perspectives:
+  - technical
+  - practical
+  - recent updates
+  - comparisons
+
+---
+
+### Phase 2 — Search
+
+Use:
 
 ```bash
-lynx -dump "https://html.duckduckgo.com/html?q=YOUR+SEARCH+QUERY"
+lynx -dump "https://html.duckduckgo.com/html?q=YOUR+QUERY"
 ```
 
-### 2. Browse a URL
+- Execute queries one by one
+- Use multiple variations
+
+---
+
+### Phase 3 — Source Selection
+
+- Identify **2–10 high-value links**
+- Prioritize:
+  - authoritative sources
+  - recent content
+  - technical depth
+
+---
+
+### Phase 4 — Browsing
 
 ```bash
-lynx -dump "https://example.com/some/path"
+lynx -dump "https://example.com"
 ```
 
-## How to Navigate Links & Redirects
+- Open selected links
+- Extract only relevant content
+
+---
+
+### Phase 5 — Extraction & Synthesis
+
+- Extract key facts
+- Group insights by topic
+- Detect:
+  - missing info
+  - contradictions
+  - shallow areas
+
+---
+
+### Phase 6 — Gap Detection (CRITICAL)
+
+Ask yourself:
+
+- What is still unclear?
+- What is missing?
+- What needs validation?
+
+➡️ Generate NEW queries
+
+---
+
+### Phase 7 — Recursive Loop
+
+Repeat:
+
+1. New queries
+2. New sources
+3. Deeper insights
+
+Continue until **Termination Rule is satisfied**
+
+---
+
+### Phase 8 — Final Synthesis
+
+- Structure answer clearly
+- Provide:
+  - key insights
+  - comparisons
+  - conclusions
+
+---
+
+## Navigation Rules
 
 ### Links
 
-When you use `lynx -dump`, the output contains a **References** section at the bottom.
-In the main text, numbers in brackets like `[1]` correspond to these links.
-### Redirects (e.g., from DuckDuckGo)
-DuckDuckGo search results often point to redirect URLs. If `lynx -dump` returns a page with a "REFRESH" link or just a single link, you MUST follow it to get the actual content.
+- Use **References section**
+- `[1], [2], ...` map to URLs
+
+---
+
+### Redirect Handling (MANDATORY)
+
+If you see:
 
-**Example of DDG redirect output:**
 ```text
-   REFRESH(0 sec):
-   [1]https://actual-destination.com/article
+REFRESH(0 sec):
+[1]https://actual-site.com
 ```
 
-In this case, execute a new search on the URL provided in `[1]`.
-**Protocol**:
-- List the URLs from the References section that you want to visit.
-- Execute new `lynx -dump` commands for those URLs.
+➡️ You MUST follow `[1]`
 
-## Guidelines
+---
 
-- **Token Efficiency**: Web pages contain noise. Focus on the core content. Use specific search queries to find the exact page instead of browsing aimlessly.
-- **Concise Summaries**: In multi-turn research, keep your internal notes concise. Do not repeat raw data if you have already extracted the key points.
-- **Max Depth**: Typically go 1-2 levels deep from the search results.
-- **Data Extraction**: Extract only relevant text. Ignore obvious navigation menus or ads.
-- **Synthesis**: Your final output should not be a raw data dump. It must be a structured, factual answer to the user's initial goal.
-- **Parallelism**: You can plan multiple `lynx` commands in parallel for different URLs if the Planner allows it.
+## Depth Enforcement Rules
 
-## Example Workflow
+You MUST:
 
-User Goal: "Research the current status of the RISC-V ecosystem in 2024."
+- Avoid stopping after first results
+- Use **multiple queries minimum**
+- Explore **multiple sources**
+- Cross-check important facts
+- Prefer **depth over speed**
 
-1. `lynx -dump "https://html.duckduckgo.com/html?q=RISC-V+ecosystem+status+2024"`
-2. Identify links for "RISC-V International news", "Phoronix RISC-V benchmarks", and "Wikipedia RISC-V".
-3. `lynx -dump "URL_FROM_PREVIOUS_STEP"` for each.
-4. Synthesize findings into a categorized report (Hardware, Software Support, Benchmarks, Corporate Adoption).
+---
+
+## Anti-Patterns (STRICTLY FORBIDDEN)
+
+❌ One search → immediate answer
+❌ Only using snippets when depth is required
+❌ Ignoring contradictions
+❌ Skipping planning
+❌ Not iterating on gaps
+
+---
+
+## Heuristics for “More Research Needed”
+
+Continue researching if:
+
+- Answer feels shallow
+- Only one source used
+- No comparison available
+- Missing recent data
+- Unverified claims exist
+
+---
+
+## Efficiency Rules
+
+- Be deep, but not wasteful
+- Avoid redundant browsing
+- Prefer high-signal sources
+- Stop only when confident
+
+---
+
+## Example Workflow (Deep)
+
+User: "State of RISC-V ecosystem"
+
+1. Plan:
+   - hardware
+   - software
+   - adoption
+   - benchmarks
+
+2. Search:
+```bash id="8n0fjf"
+lynx -dump "https://html.duckduckgo.com/html?q=RISC-V+ecosystem+2025"
+```
+
+3. Browse top links
+
+4. Detect gaps:
+   - missing benchmarks
+
+5. New search:
+```bash id="dm73x0"
+lynx -dump "https://html.duckduckgo.com/html?q=RISC-V+benchmarks+2025"
+```
+
+6. Repeat until complete
+
+---
+
+## Final Rule
+
+> Research is **not finished when you find an answer**
+> Research is finished when **nothing important is missing**
+
+> Prefer:
+> - completeness over speed
+> - synthesis over collection
+> - depth over surface
\ No newline at end of file
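
Reviewer sketch (not part of the patch): the search, browse, and redirect-handling steps that the new SKILL.md describes could be wrapped roughly as below. This is a minimal sketch under stated assumptions: the function names `ddg_url`, `refresh_target`, and `fetch` are hypothetical, queries are assumed to be ASCII, `lynx` is assumed to be on `PATH`, and only a single `REFRESH` hop is followed.

```bash
#!/usr/bin/env bash
# Hypothetical helpers illustrating the patch's search/browse/redirect steps.
# None of these names appear in SKILL.md; they are assumptions for illustration.

# Build a DuckDuckGo HTML search URL from a free-text query (ASCII input only).
# Spaces become '+'; other reserved characters are percent-encoded.
ddg_url() {
  local query="$1" out="" ch i
  for ((i = 0; i < ${#query}; i++)); do
    ch="${query:i:1}"
    case "$ch" in
      [a-zA-Z0-9.~_-]) out+="$ch" ;;
      ' ') out+='+' ;;
      *) printf -v ch '%%%02X' "'$ch"; out+="$ch" ;;
    esac
  done
  printf 'https://html.duckduckgo.com/html?q=%s\n' "$out"
}

# If the dump text looks like a REFRESH page, print the URL behind reference [1].
# Returns 1 when the text contains no REFRESH marker.
refresh_target() {
  local dump="$1"
  grep -q 'REFRESH(' <<<"$dump" || return 1
  grep -oE '\[1\]https?://[^[:space:]]+' <<<"$dump" | head -n 1 | cut -c4-
}

# Dump a page with lynx, following at most one REFRESH redirect.
fetch() {
  local url="$1" dump target
  dump="$(lynx -dump "$url")" || return 1
  if target="$(refresh_target "$dump")" && [[ -n "$target" ]]; then
    dump="$(lynx -dump "$target")" || return 1
  fi
  printf '%s\n' "$dump"
}

# Example (requires lynx and network access):
#   fetch "$(ddg_url "RISC-V benchmarks 2025")"
```

Keeping URL construction and redirect detection as pure text functions means they can be checked without network access; only `fetch` touches `lynx`.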