soulsurfer — How It Works

User Experience

Soulsurfer runs frontier language models over a purpose-built corpus of tens of millions of cultural documents. Ask questions, follow ideas, build new patterns. Sessions are open-ended, with custom reports and flexible outcomes.

Semantic Search

Search results are grouped by similarity of content rather than by keyword match. A query about grief surfaces relevant material even when the word "grief" does not appear. The corpus spans thousands of screenplays, millions of song lyrics, and tens of millions of Reddit comments.
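As a rough illustration of similarity-based retrieval, the sketch below ranks documents by cosine similarity between vectors. The three-dimensional vectors are hand-made toy values, not output from any real embedding model, and the names `cosine`, `rank`, `DOCS`, and `GRIEF_QUERY` are hypothetical; the platform's actual model and index are not described here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def rank(query_vec, docs, top_k=2):
    """Return the top_k (score, text) pairs, most similar first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in docs]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]

# Toy vectors (assumption): axes loosely mean (loss, celebration, travel).
DOCS = [
    ("I still set a place for her at the table.", [0.9, 0.1, 0.0]),
    ("We danced until the sun came up.", [0.0, 0.95, 0.1]),
    ("The train left the station at noon.", [0.05, 0.0, 0.9]),
]

# A query about grief points along the "loss" axis.
GRIEF_QUERY = [1.0, 0.0, 0.0]

results = rank(GRIEF_QUERY, DOCS, top_k=1)
print(results[0][1])
```

The top-ranked line never contains the word "grief"; it ranks first only because its vector lies closest to the query vector.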

SQL

Quantitative questions are asked in plain English. The platform writes the SQL and executes it against the database, returning counts, joins, calculated values, and any other concrete numerical result or pattern. If the initial result is incomplete, the query reruns automatically, for up to eight rounds. Every result returns with a link to the source records that validate it.
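The rerun behavior can be sketched as a bounded retry loop. This is a minimal sketch under stated assumptions: `generate_sql` stands in for the language model, `is_complete` stands in for whatever completeness check the platform applies, and the `songs` table is invented for the demo. Only the cap of eight rounds comes from the description above.

```python
import sqlite3

MAX_ROUNDS = 8  # the cap stated in the text

def answer(conn, question, generate_sql, is_complete):
    """Generate SQL, run it, and retry with feedback until complete or capped."""
    feedback = None
    for round_no in range(1, MAX_ROUNDS + 1):
        sql = generate_sql(question, feedback)
        try:
            rows = conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            # Pass the database error back so the next attempt can correct it.
            feedback = f"error: {exc}"
            continue
        if is_complete(rows):
            return {"sql": sql, "rows": rows, "rounds": round_no}
        feedback = "result was empty or incomplete"
    return None  # no acceptable result within MAX_ROUNDS

# Demo data (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE songs (id INTEGER PRIMARY KEY, artist TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO songs (artist, title) VALUES (?, ?)",
    [("A", "One"), ("A", "Two"), ("B", "Three")],
)

def stub_generate_sql(question, feedback):
    """Stand-in for the model: the first attempt uses a wrong table name."""
    if feedback is None:
        return "SELECT COUNT(*) FROM song WHERE artist = 'A'"
    return "SELECT COUNT(*) FROM songs WHERE artist = 'A'"

result = answer(conn, "How many songs by A?", stub_generate_sql, lambda rows: bool(rows))
print(result)
```

In this demo the first query fails on the misspelled table name, the error is fed back, and the second round succeeds.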

Data Visualization

Charts are generated inline from live data at the moment of the query, with no export step. Available types include scatterplots, radar charts, time series, bar charts, heatmaps, and network graphs. The platform can select the chart type from the structure of the question, or the user can specify one.
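One way to picture automatic chart selection is a small set of rules over the column kinds in a result. The rules below are illustrative assumptions, not the platform's actual logic; `pick_chart` and the kind labels `"time"`, `"numeric"`, and `"category"` are hypothetical, and network graphs are left out for brevity.

```python
def pick_chart(kinds, user_choice=None):
    """Pick a chart type from a list of column kinds.

    kinds: list of "time", "numeric", or "category", one per result column.
    user_choice: an explicit chart type, which always takes precedence.
    """
    if user_choice is not None:
        return user_choice

    n_time = kinds.count("time")
    n_num = kinds.count("numeric")
    n_cat = kinds.count("category")

    if n_time >= 1 and n_num >= 1:
        return "time series"   # a value tracked over time
    if n_cat >= 2 and n_num >= 1:
        return "heatmap"       # a value across two categorical axes
    if n_cat == 1 and n_num >= 3:
        return "radar chart"   # several measures per category
    if n_cat == 1 and n_num >= 1:
        return "bar chart"     # one or two measures per category
    if n_num >= 2:
        return "scatterplot"   # relationship between numeric columns
    return "bar chart"         # fallback when nothing else matches

print(pick_chart(["time", "numeric"]))
print(pick_chart(["numeric", "numeric"]))
print(pick_chart(["category", "numeric"], user_choice="radar chart"))
```

Checking the user's choice first mirrors the description above: the automatic rules apply only when no chart type is specified.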

Source Citations

Every result links directly to the source record it came from. The platform does not generate or infer data; it retrieves only from the indexed corpus. If a claim appears in the response, the supporting row exists in the database.
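The guarantee that every claim has a supporting row can be illustrated by carrying the row id alongside each returned value and checking that the cited row exists. This sketch assumes a hypothetical `comments` table and a made-up `table/id` reference format; the platform's real schema and link format are not given here.

```python
import sqlite3

ALLOWED_TABLES = {"comments"}  # hypothetical table name, whitelisted for safety

def fetch_with_citations(conn, table, where_sql, params=()):
    """Return matching rows, each paired with a reference to its source record."""
    if table not in ALLOWED_TABLES:
        raise ValueError(f"unknown table: {table}")
    cur = conn.execute(f"SELECT id, body FROM {table} WHERE {where_sql}", params)
    return [{"text": body, "source": f"{table}/{row_id}"} for row_id, body in cur]

def citation_exists(conn, source):
    """Check that a 'table/id' reference points at a row that really exists."""
    table, _, row_id = source.partition("/")
    if table not in ALLOWED_TABLES or not row_id.isdigit():
        return False
    row = conn.execute(f"SELECT 1 FROM {table} WHERE id = ?", (int(row_id),)).fetchone()
    return row is not None

# Demo data (invented for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)")
conn.executemany(
    "INSERT INTO comments (body) VALUES (?)",
    [("Loved the soundtrack.",), ("The ending felt rushed.",)],
)

results = fetch_with_citations(conn, "comments", "body LIKE ?", ("%ending%",))
print(results)
print(all(citation_exists(conn, r["source"]) for r in results))
```

Because each reference is built from the retrieved row itself, a result cannot cite a record that was never read from the database.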

Reddit
- 607 subreddits
- 1.25M+ posts
- 20M+ comments

Screenplays
- 2,156 screenplays
- 284K+ scenes
- 71K+ indexed characters

Lyrics
- 1.3M+ songs
- 197K+ artists
- 6.5M+ verses