source: arxiv artificial intelligence: sciatlas: a large-scale knowledge graph for automated scientific research

level: research

sciatlas is a large knowledge graph built from over 43 million papers across 26 fields. it contains 157 million entities and 3 billion relationships. the goal is to give ai systems a structured map of scientific knowledge. current search tools rely on keywords or vector similarity, so they miss deeper logical links between findings. agentic research frameworks often hallucinate or cost too much to run. sciatlas aims to address both problems by offering a topological view of how ideas connect.

the graph models papers, authors, concepts, and their links as a network. this lets ai traverse paths between findings, spot gaps, and form hypotheses. the team used automated extraction pipelines to build the graph at scale. they normalized entities across disciplines to enable cross-field reasoning. early tests show the graph helps ai answer complex science questions with fewer errors. it also reduces the need for expensive language model calls by providing precomputed relational data.
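
to make the idea of traversing paths between findings concrete, here is a minimal sketch in python. it is not the sciatlas implementation: the node ids, the relation names (`mentions`, `wrote`), and the helper functions are all hypothetical, and the toy graph has five edges rather than billions. it only shows how a breadth-first search over precomputed edges can link two papers through shared concepts without any language model call.

```python
from collections import deque

# Toy heterogeneous graph. All ids and relation names are hypothetical
# illustrations, not data or schema taken from SciAtlas.
EDGES = [
    ("paper:A", "mentions", "concept:graph-attention"),
    ("paper:B", "mentions", "concept:graph-attention"),
    ("paper:B", "mentions", "concept:protein-folding"),
    ("paper:C", "mentions", "concept:protein-folding"),
    ("author:lee", "wrote", "paper:C"),
]


def build_adjacency(edges):
    """Index edges by node; edges are treated as undirected for traversal."""
    adj = {}
    for src, rel, dst in edges:
        adj.setdefault(src, []).append((rel, dst))
        adj.setdefault(dst, []).append((rel, src))
    return adj


def shortest_path(adj, start, goal):
    """Breadth-first search; returns the node list of a shortest path or None."""
    if start == goal:
        return [start]
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _rel, nxt in adj.get(node, []):
            if nxt in parent:
                continue  # already visited
            parent[nxt] = node
            if nxt == goal:
                # Walk parent links back to the start, then reverse.
                path = []
                cur = goal
                while cur is not None:
                    path.append(cur)
                    cur = parent[cur]
                return path[::-1]
            queue.append(nxt)
    return None


ADJ = build_adjacency(EDGES)
PATH = shortest_path(ADJ, "paper:A", "paper:C")
print(PATH)
```

in this toy graph, paper A and paper C never cite each other, but the search connects them through paper B and two shared concepts. a chain like that is the kind of indirect link that keyword or vector search would not surface on its own.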

sciatlas is designed as a substrate for ai agents doing deep research. it can support tasks like literature review, trend analysis, and discovery. the graph is heterogeneous, meaning it holds many types of nodes and edges. this richness allows for flexible querying and reasoning. the project is open and aims to become a shared resource for the ai and science communities. future work will expand coverage and improve entity resolution.
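
as a rough illustration of what a heterogeneous graph allows, the sketch below stores typed nodes and typed edges and answers a small trend-analysis question: how many papers mention a concept in each year. the schema here (a `type` field, a `year` field on papers, a `mentions` relation) is an assumption made for the example and is not taken from the sciatlas paper.

```python
from collections import Counter

# Typed nodes: every node carries a "type"; other fields depend on the type.
# The schema and all values are hypothetical, chosen only for illustration.
NODES = {
    "p1": {"type": "paper", "year": 2021},
    "p2": {"type": "paper", "year": 2022},
    "p3": {"type": "paper", "year": 2022},
    "a1": {"type": "author"},
    "c1": {"type": "concept", "name": "diffusion models"},
}

# Typed edges as (source, relation, destination) triples.
EDGES = [
    ("p1", "mentions", "c1"),
    ("p2", "mentions", "c1"),
    ("p3", "mentions", "c1"),
    ("a1", "wrote", "p3"),
]


def sources(edges, rel, dst):
    """Return every node that points at `dst` through relation `rel`."""
    return [s for s, r, d in edges if r == rel and d == dst]


def mentions_per_year(nodes, edges, concept_id):
    """Count papers mentioning a concept, grouped by publication year."""
    years = Counter()
    for node_id in sources(edges, "mentions", concept_id):
        node = nodes[node_id]
        if node["type"] == "paper":  # ignore non-paper nodes
            years[node["year"]] += 1
    return dict(sorted(years.items()))


TREND = mentions_per_year(NODES, EDGES, "c1")
print(TREND)
```

because the query filters on node and edge types, the same data can answer other questions (for example, which authors wrote papers mentioning a concept) by changing the relation name rather than the storage layout.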

why it matters: it gives ai a reliable, structured knowledge base to reduce hallucinations and inference costs in scientific research.

