Politics of Poverty

Ideas and analysis from Oxfam America's policy experts

ReadingMachine: Too much to read, too little time

Theuth and Thamus discuss reading in the library of Alexandria. Image generated by ChatGPT.

In Plato's Phaedrus, Theuth invents writing and Thamus questions whether it creates wisdom or merely the appearance of wisdom.

ReadingMachine is an experiment in a different question: what happens when reading scales too?

We all know the feeling. A colleague sends a long report with the subject line “Looks interesting.” Your manager forwards an article exposing a new controversy and says: “We should look at this.” You move the email to a folder. You think about the hundreds of reports you haven’t read. You imagine being asked to pivot into a new area you barely know. You calculate how long it would take to get up to speed—and feel that sinking sense that you don’t have the time. You consider locking yourself in a room for a weekend, but know that isn’t realistic. You think of commissioning a literature review on this new area of work, but know it will be slow, expensive, and that by the time it arrives, your inbox will already contain another pile of unread material.

More recently, you might have tried feeding the documents into your company’s AI tool. But when you see the result, the parts of the literature you know well have been treated too shallowly, and you worry about what must be missing from the parts you don’t know.

What you are experiencing isn’t new (except for the AI part...). From the printing press to digital to networked systems, the volume of text has exploded. Reading, however, has not scaled in the same way. Constrained by human effort and attention, reading remains effectively artisanal. The result is that the constraint is no longer access to information. It’s the ability to read it all. Reading has become the bottleneck. The challenge is no longer producing knowledge. It is systematically engaging with what has already been written.

AI tools as the solution?

AI tools offer the real possibility of relief. Language models can read and summarize text at extraordinary speed. They don’t get tired. They don’t get overwhelmed. Tools integrated into document systems can now summarize files and retrieve relevant content across them. They are genuinely useful. But they have limits—especially at scale.

The reason your company’s AI tool didn’t give you what you wanted in the opening example is that current tools are optimized to answer questions. They do this by retrieving a subset of relevant content and generating a response from it. This approach, commonly called Retrieval-Augmented Generation (RAG), is fast and effective. But it comes with trade-offs:

  • Content that isn’t retrieved is omitted
  • Distinct arguments are collapsed into generalizations
  • Disagreements are smoothed over
  • Minority or edge-case positions disappear

Even when these tools work, you get an answer but not the landscape of content that informed it, and you have no visibility into what was left out. For many domains, that’s not enough. In policy, research, and law (among others), we often don’t want the answer. We want to understand:

  • the range of positions
  • how arguments differ
  • where the disagreements lie
  • what sits at the margins

In other words, we want a map of the literature.

ReadingMachine: A computational methodology for structured corpus reading and large-scale synthesis

ReadingMachine was built for that use case. Instead of retrieving fragments to answer a question, ReadingMachine performs a structured reading pass across an entire corpus and produces a structured map of the material. It operationalizes familiar qualitative research methods—at scale.

Humans define:

  • the research questions
  • the scope of the corpus

The system then performs:

  • systematic reading
  • extraction of claims
  • identification of themes
  • thematic synthesis
  • completeness checks
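The division of labour above can be sketched in a few lines. This is a minimal illustration under loud assumptions, not ReadingMachine’s actual code or API: every name is hypothetical, sentence splitting stands in for model-based claim extraction, and a fixed keyword table stands in for theme identification, which the real method derives from the material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    doc_id: str
    text: str


def extract_claims(doc_id: str, text: str) -> list[Claim]:
    """Placeholder: treat each sentence as one claim.

    Assumption: a real system would use a language model here.
    """
    return [Claim(doc_id, s.strip()) for s in text.split(".") if s.strip()]


def assign_theme(claim: Claim, themes: dict[str, str]) -> str:
    """Placeholder keyword match; 'unassigned' if no keyword is found."""
    lowered = claim.text.lower()
    for theme, keyword in themes.items():
        if keyword in lowered:
            return theme
    return "unassigned"


def read_corpus(corpus: dict[str, str], themes: dict[str, str]) -> dict[str, list[Claim]]:
    """Read every document in the corpus and group its claims by theme."""
    by_theme: dict[str, list[Claim]] = {}
    for doc_id, text in corpus.items():  # full pass: no document is skipped
        for claim in extract_claims(doc_id, text):
            by_theme.setdefault(assign_theme(claim, themes), []).append(claim)
    return by_theme


def completeness_check(corpus: dict[str, str], by_theme: dict[str, list[Claim]]) -> list[str]:
    """Return the ids of documents that contributed no claims."""
    seen = {c.doc_id for claims in by_theme.values() for c in claims}
    return sorted(set(corpus) - seen)


# Hypothetical corpus and theme keywords, invented for illustration.
corpus = {
    "doc_a": "Subsidies raised exports. Subsidies were captured by incumbents.",
    "doc_b": "State capacity determines outcomes.",
    "doc_c": "",
}
themes = {"subsidies": "subsid", "state capacity": "capacity"}

structure = read_corpus(corpus, themes)
missing = completeness_check(corpus, structure)
```

Here the two claims from `doc_a` stay separate under the same theme even though they pull in different directions, and the empty `doc_c` is flagged by the completeness check rather than silently dropped.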

The result is not an answer. It is a structured representation of the literature. To see what I mean, take a quick look at the output of our test run on the industrial policy literature. It’s immediately recognizable as different from the output of a standard AI tool.

What makes it different?

Two differences matter.

First, it is designed to parse the entire corpus rather than a retrieved subset of it. This reduces the risk of omission and increases the likelihood that all relevant material is surfaced.

Second, it does not reason for you. It does not decide what is correct, important, or persuasive. It does not collapse disagreement into consensus. Instead, it preserves:

  • granularity of claims
  • distinctions between arguments
  • areas of conflict

What you get is a dense, high-coverage map of the literature, not a simplified answer: a map that you can reason over to draw conclusions informed by your own expertise.

Overall, the user defines the questions and scope of the corpus, and interprets the resulting structure. The system performs the reading. It is, quite literally, a reading machine—with the user setting the dials that determine what is read and how it is organized.

What is it for?

ReadingMachine is not a replacement for existing AI tools. Those tools are excellent for:

  • exploration
  • rapid iteration
  • quick summaries

ReadingMachine is not built for any of these. It is:

  • slow (hours to run)
  • relatively expensive (hundreds of dollars over a large corpus)
  • cognitively demanding to read

But when completeness matters, that is, when decisions depend on understanding the full structure of a body of work, it may reduce the time and cost of producing an initial structured reading by one to two orders of magnitude.

Want to get involved?

ReadingMachine is still experimental. The code is open source, but the most important next step is evaluation. The industrial policy corpus demonstrates what the system does; it does not yet demonstrate how well it performs relative to other approaches.

We're therefore looking for domain experts—particularly in international development, political economy, policy analysis, evidence synthesis, and related fields—to review the output. Reviews will be published openly on GitHub and used to help develop a more formal evaluation framework for structured corpus reading.

Beyond that, we’d love to hear from you. If you are facing the same reading overwhelm, want to try the tool, have an interest in supporting the codebase, or just have a question, do get in touch.

Resources:

  • Evaluation (industrial policy test case) – please participate if you have expertise in international development and political economy
  • Whitepaper – for those interested in the details of the method (also available on arXiv)
  • Tutorial – for anyone who wants to try running the code
  • GitHub repository – for the codebase
