Research moves fast.
Now you do too.

NovelSights monitors arXiv daily, scores papers for commercial relevance, and surfaces the signals worth building on — delivered to your inbox before the market catches up.

Used by founders and operators tracking AI, quant finance, cybersecurity, and climate tech.

How it works

1. We monitor arXiv for your topics daily

NovelSights watches the categories and keywords you care about — across AI, quant finance, cybersecurity, climate, and more.

2. Claude scores every paper

Each paper gets a relevance score, a commercial-potential score, and a plain-English summary. No jargon, no hype.

3. You get the signal, not the noise

Top hits land in your inbox and your dashboard — and the best ideas go straight into your build pipeline.
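
As a rough illustration of the three steps above, here is a minimal Python sketch. It is not NovelSights' actual implementation: the function names, thresholds, and ranking rule are assumptions made for the example. Only the arXiv API endpoint and its query parameters are real, and the scores are copied from the sample digests on this page.

```python
from dataclasses import dataclass
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"  # public arXiv API endpoint


@dataclass(frozen=True)
class ScoredPaper:
    title: str
    rel: float  # relevance score (assumed to range from 0 to 1)
    com: float  # commercial-potential score (assumed to range from 0 to 1)


def arxiv_query_url(categories, keywords, max_results=50):
    """Step 1: build a URL for the newest papers in any category matching any keyword."""
    if not categories or not keywords:
        raise ValueError("need at least one category and one keyword")
    cats = " OR ".join(f"cat:{c}" for c in categories)
    words = " OR ".join(f'all:"{k}"' for k in keywords)
    params = {
        "search_query": f"({cats}) AND ({words})",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def top_signals(papers, min_rel=0.85, min_com=0.75, limit=3):
    """Step 3: keep papers above both (hypothetical) thresholds, best first."""
    kept = [p for p in papers if p.rel >= min_rel and p.com >= min_com]
    # Rank by relevance, breaking ties with commercial potential.
    return sorted(kept, key=lambda p: (p.rel, p.com), reverse=True)[:limit]


# Step 2 (scoring) is done by a model in the real product; here the scores
# are simply hard-coded from the sample digests.
papers = [
    ScoredPaper("Context-Based Adversarial Attacks", rel=0.88, com=0.82),
    ScoredPaper("Arbor", rel=0.97, com=0.88),
    ScoredPaper("DeNovoSWE", rel=0.92, com=0.78),
]

url = arxiv_query_url(["cs.AI", "cs.CR"], ["code generation"])
for paper in top_signals(papers):
    print(f"{paper.title}: rel {paper.rel:.2f} | com {paper.com:.2f}")
```

Raising `min_com` in this sketch trades volume for commercial focus: with `min_com=0.8`, DeNovoSWE (com 0.78) drops out of the digest.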

Real signals, from real digests

Pulled straight from NovelSights production runs — scored, summarized, and translated into commercial terms.

Toward Generalist Autonomous Research via Hypothesis-Tree Refinement

rel 0.97 | com 0.88

Arbor is an AI framework where a long-running coordinator directs short-lived worker agents to run experiments, track results in a tree structure, and iteratively improve research artifacts—without human supervision. It outperformed Codex and Claude Code on six real research tasks.

Why it matters

Autonomous research agents that compound learnings over time directly address the core bottleneck in AI-assisted R&D: context and strategy loss between iterations. This architecture pattern is a blueprint for commercial AI research automation in drug discovery, ML engineering, and data pipelines.

DeNovoSWE: Scaling Long-Horizon Environments for Generating Entire Repositories from Scratch

rel 0.92 | com 0.78

A dataset and training approach that teaches AI agents to build entire code repositories from scratch using only documentation, raising the score on a whole-repo generation benchmark from 5.8% to 47.2%.

Why it matters

Moving from 'fix a bug' to 'build an entire repo from a spec' is the next frontier for AI coding tools. This could power next-gen copilots that generate full project scaffolds, dramatically reducing time-to-MVP.

Context-Based Adversarial Attacks on AI Code Generators: Vulnerability Analysis and Implications

rel 0.88 | com 0.82

Attackers can manipulate AI coding assistants into generating insecure code by planting malicious comments or variable names in the surrounding context. Insecure code was generated 10x more often under adversarial conditions, and a defense layer that detects these attacks achieved 89% accuracy with minimal false positives.

Why it matters

AI coding tools are now embedded in enterprise dev workflows. If adversarial inputs can reliably trigger vulnerable code generation, this is a supply chain risk — and a real-time defense layer is commercially viable as security middleware for AI-assisted development.

Be first when we open.