What are NodeSets?

A NodeSet lets you group parts of your AI memory at the dataset level. You create NodeSets by passing a simple list of tags when adding data to Cognee: await cognee.add(…, node_set=["projectA", "finance"]). These tags travel with your data into the knowledge graph, where they become first-class nodes connected with belongs_to_set edges, and you can later filter searches to only those subsets.

How they flow through Cognee

  • Add:
    • NodeSets are attached as simple tags to datasets or documents
    • This happens when you first ingest data
  • Cognify:
    • Tags are carried into Documents and Chunks
    • They are materialized as real NodeSet nodes in the graph
    • Those nodes are connected with belongs_to_set edges
  • Search:
    • NodeSets act as entry points into the graph
    • Queries can be scoped to only nodes linked to specific NodeSets
    • This lets you search within a tagged subset of your data
  • Memify:
    • The default memify pipeline creates the coding_agent_rules node set containing derived coding rules
    • The session persistence pipeline creates the user_sessions_from_cache node set from cached conversation history
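
The Add, Cognify, and Search steps above can be sketched with a toy in-memory model. This is only an illustration under simplifying assumptions, not Cognee's implementation: ToyGraph, toy_add, toy_cognify, and toy_search are hypothetical names, documents stand in for everything cognify would actually extract, and the Memify step is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGraph:
    """Hypothetical stand-in for a knowledge graph."""
    nodes: set = field(default_factory=set)   # (node_type, name)
    edges: set = field(default_factory=set)   # (source, relation, target)


def toy_add(store, text, node_set):
    """Add step: keep the tags next to the raw text."""
    store.append({"text": text, "node_set": list(node_set)})


def toy_cognify(store):
    """Cognify step: turn each tag into a NodeSet node plus an edge."""
    graph = ToyGraph()
    for index, item in enumerate(store):
        doc_id = f"doc_{index}"
        graph.nodes.add(("Document", doc_id))
        for tag in item["node_set"]:
            graph.nodes.add(("NodeSet", tag))
            graph.edges.add((doc_id, "belongs_to_set", tag))
    return graph


def toy_search(graph, node_set_name):
    """Search step: return only nodes linked to the given NodeSet."""
    return sorted(
        source
        for source, relation, target in graph.edges
        if relation == "belongs_to_set" and target == node_set_name
    )


store = []
toy_add(store, "Cognee builds AI memory from raw documents.", ["AI_Memory"])
toy_add(store, "Cognee combines vector search with graph reasoning.",
        ["AI_Memory", "Graph_RAG"])
graph = toy_cognify(store)

print(toy_search(graph, "AI_Memory"))   # ['doc_0', 'doc_1']
print(toy_search(graph, "Graph_RAG"))   # ['doc_1']
```

The point of the sketch is that a tag is plain metadata at add time and only becomes a graph node, with edges to traverse, once the graph is built.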

Why they matter

  • Provide a lightweight way to organize and tag your data
  • Enable graph-based filtering, traversal, and reporting
  • Ideal for creating project-, domain-, or user-defined subsets of your knowledge graph

Example

import asyncio
import cognee

async def main():
    # reset Cognee’s memory and metadata for a clean run
    await cognee.prune.prune_data()
    await cognee.prune.prune_system(metadata=True)

    # add a document linked only to the "AI_Memory" node set
    await cognee.add(
        "Cognee builds AI memory from raw documents.",
        node_set=["AI_Memory"]
    )

    # add a document linked to both "AI_Memory" and "Graph_RAG" node sets
    await cognee.add(
        "Cognee combines vector search with graph reasoning.",
        node_set=["AI_Memory", "Graph_RAG"]
    )

    # build the knowledge graph by extracting entities and relationships
    await cognee.cognify()

if __name__ == "__main__":
    asyncio.run(main())

What just happened?

  • You reset Cognee’s memory so you’re working with a clean graph.
  • You added two documents, each tagged with one or more NodeSet labels.
    • The first document is only linked to AI_Memory.
    • The second document is linked to both AI_Memory and Graph_RAG.
  • When you ran cognify(), Cognee:
    • Created NodeSet nodes (AI_Memory, Graph_RAG) in the graph.
    • Attached each document to the corresponding NodeSets.
    • Extracted entities and relationships from the documents, then linked those entities back to the same NodeSets.
This means the tags you add flow down into the extracted entities:
  • “Cognee” appears in both documents → connects to both NodeSets.
  • “AI memory” appears only in the first → connects only to AI_Memory.
  • “Vector search” appears only in the second → connects to both since that document belongs to AI_Memory and Graph_RAG.
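
The inheritance rule described above can be checked with a small sketch. It assumes the entities per document are already known (they are hand-written here, where Cognee would extract them), and propagate_tags is a hypothetical helper, not part of the Cognee API.

```python
# Hand-written stand-in for what entity extraction might produce.
documents = {
    "doc_1": {
        "node_set": {"AI_Memory"},
        "entities": {"Cognee", "AI memory"},
    },
    "doc_2": {
        "node_set": {"AI_Memory", "Graph_RAG"},
        "entities": {"Cognee", "Vector search"},
    },
}


def propagate_tags(documents):
    """Give every entity the union of the tags of the documents it appears in."""
    entity_sets = {}
    for doc in documents.values():
        for entity in doc["entities"]:
            entity_sets.setdefault(entity, set()).update(doc["node_set"])
    return entity_sets


entity_sets = propagate_tags(documents)
for entity in sorted(entity_sets):
    print(entity, sorted(entity_sets[entity]))
# AI memory ['AI_Memory']
# Cognee ['AI_Memory', 'Graph_RAG']
# Vector search ['AI_Memory', 'Graph_RAG']
```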
With NodeSets in the graph, you can:
  • Filter or scope search queries to specific NodeSets.
  • Navigate data by project or domain using NodeSets.
When filtering with multiple NodeSet names, you can choose whether results must be connected to all of the selected names (AND) or to any of them (OR). By default, Cognee uses OR-style matching. To change this, pass the desired value (AND or OR) via the node_name_filter_operator parameter of the search function.
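
The difference between the two matching modes can be illustrated with plain set operations. This is a minimal sketch of the semantics only: filter_by_node_sets and the memberships mapping are hypothetical and do not call Cognee, and the operator strings simply mirror the AND and OR values named above.

```python
def filter_by_node_sets(memberships, names, operator="OR"):
    """Return node names whose NodeSet memberships match the selection.

    OR  -> the node belongs to at least one selected NodeSet.
    AND -> the node belongs to every selected NodeSet.
    """
    wanted = set(names)
    mode = operator.upper()
    if mode not in ("AND", "OR"):
        raise ValueError(f"unsupported operator: {operator!r}")

    matches = []
    for node, node_sets in memberships.items():
        if mode == "OR" and node_sets & wanted:
            matches.append(node)
        elif mode == "AND" and wanted <= node_sets:
            matches.append(node)
    return sorted(matches)


# Memberships taken from the two-document example on this page.
memberships = {
    "Cognee": {"AI_Memory", "Graph_RAG"},
    "AI memory": {"AI_Memory"},
    "Vector search": {"AI_Memory", "Graph_RAG"},
}
selected = ["AI_Memory", "Graph_RAG"]

print(filter_by_node_sets(memberships, selected))         # ['AI memory', 'Cognee', 'Vector search']
print(filter_by_node_sets(memberships, selected, "AND"))  # ['Cognee', 'Vector search']
```

With OR, "AI memory" is returned because it belongs to one of the two selected sets; with AND it drops out because it is not linked to Graph_RAG.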

Add

Where NodeSets are first attached

Cognify

How NodeSets are promoted into graph nodes

Search

Use NodeSets as anchors in queries