Pipeline
1
Check source routes
The project monitors a set of public sources likely to contain relevant material: news outlets, NGO publications, press freedom organisations, human rights reports, and selected PDF publications. Sources are reached through different discovery strategies depending on what each one exposes — XML sitemaps, WordPress REST APIs, RSS feeds, Brave Search results, site-specific APIs (Meduza, RFA), and direct browser-based page scraping. Before a larger run, a pre-flight check verifies that each source's discovery path, API keys, browser profiles, and connection patterns are still working as expected. Routes that break are flagged before the scraper spends time on them.
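A pre-flight check of this kind can be sketched as follows. This is a minimal illustration, not the project's actual code: the `Route` type, the `probe` callables, and the source names are hypothetical, and a real probe would make a network request instead of returning a fixed value.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Route:
    source: str                # hypothetical source name
    strategy: str              # e.g. "sitemap", "rss", "wp_rest", "site_api"
    probe: Callable[[], bool]  # returns True if the route still responds as expected


def preflight(routes: list[Route]) -> tuple[list[Route], list[tuple[Route, str]]]:
    """Split routes into usable ones and flagged ones, each with a reason."""
    ok, flagged = [], []
    for route in routes:
        try:
            if route.probe():
                ok.append(route)
            else:
                flagged.append((route, "probe returned no usable result"))
        except Exception as exc:  # one broken route must not abort the whole check
            flagged.append((route, f"{type(exc).__name__}: {exc}"))
    return ok, flagged


def _broken_probe() -> bool:
    # Stand-in for a request that times out.
    raise TimeoutError("no response")


routes = [
    Route("example-news", "rss", lambda: True),
    Route("example-ngo", "sitemap", lambda: False),
    Route("example-api", "site_api", _broken_probe),
]
ok, flagged = preflight(routes)
for route, reason in flagged:
    print(f"FLAGGED {route.source} ({route.strategy}): {reason}")
```

Only the routes in `ok` would be handed to the scraper; the flagged ones carry a reason so they can be repaired before the larger run.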
2
Find or receive leads
Possible cases come from automated discovery across the monitored source set. Cases can also be manually suggested by readers, researchers, or affected people through the submission form on this site. A suggested case is only a lead. It must still be checked against public sources and manually approved before publication.
3
Filter for financial censorship
A case must involve a financial restriction — a frozen account, closed account, blocked payment, loss of payment infrastructure, restricted fundraising, asset seizure, or comparable financial pressure. The restriction must be connected to speech, association, journalism, political activity, civil society work, protest, religion, identity, or another public-interest context. Keyword screening is intentionally broad at this stage to avoid missing relevant leads. Not every article that passes this filter will become a published case.
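As an illustration of what a deliberately broad keyword screen might look like, the sketch below requires three loose signals: a financial term, a restriction term, and a public-interest context term. The word lists and the `passes_screen` function are assumptions made for this example; the project's real keyword set is not published here.

```python
import re

# Hypothetical, deliberately loose stems. Matching on stems over-includes by design.
FINANCE = re.compile(r"\b(?:account|payment|fundrais|donation|asset|bank)\w*", re.IGNORECASE)
RESTRICTION = re.compile(
    r"\b(?:fr(?:ee|o)z|clos|block|seiz|suspend|restrict|debank)\w*", re.IGNORECASE
)
CONTEXT = re.compile(
    r"\b(?:journalis|activis|protest|opposition|ngo|charit|church|campaign|dissident)\w*",
    re.IGNORECASE,
)


def passes_screen(text: str) -> bool:
    """Return True if the text mentions finance, a restriction, and a public-interest context."""
    return bool(FINANCE.search(text) and RESTRICTION.search(text) and CONTEXT.search(text))


examples = [
    "The bank froze the accounts of an independent journalist.",
    "The central bank raised interest rates.",
    "Protesters blocked a road in the capital.",
]
results = [passes_screen(text) for text in examples]
```

A screen like this will pass many articles that are not cases at all, which matches the intent of this step: later extraction and manual review do the narrowing.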
4
Extract structured records
When a lead appears relevant, structured fields are extracted from the article text: event date, country, target name, actor, method, category, summary, source URL, and confidence. The extraction uses the project's explicit definition of financial censorship, a fixed schema, and instructions that prefer null over guessing — a missing date is left blank rather than approximated from context. Claude handles the extraction as a drafting tool. Batches of articles are sent to the API together for efficiency. The AI does not decide whether a case is published.
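The fixed schema and the null-over-guessing rule can be sketched like this. The field names follow the list above; the field types, the `CaseRecord` class, and the assumption that the model returns one JSON object per article are illustrative only.

```python
import json
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class CaseRecord:
    # Every field defaults to None so that "not stated" stays distinguishable from a guess.
    event_date: Optional[str] = None
    country: Optional[str] = None
    target_name: Optional[str] = None
    actor: Optional[str] = None
    method: Optional[str] = None
    category: Optional[str] = None
    summary: Optional[str] = None
    source_url: Optional[str] = None
    confidence: Optional[float] = None  # assumed numeric; the real type may differ


def parse_extraction(raw_json: str) -> CaseRecord:
    """Map a model response onto the fixed schema, keeping blanks as None."""
    data = json.loads(raw_json)
    known = {f.name for f in fields(CaseRecord)}
    cleaned = {}
    for key, value in data.items():
        if key not in known:
            continue  # fixed schema: drop fields that are not part of it
        if isinstance(value, str) and not value.strip():
            value = None  # an empty string is treated as missing, not as a value
        cleaned[key] = value
    return CaseRecord(**cleaned)


raw = '{"country": "Exampleland", "event_date": "", "method": "frozen account", "mood": "angry"}'
record = parse_extraction(raw)
```

In this example `record.event_date` stays `None` because the article gave no date, and the unexpected `mood` field is discarded.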
5
Deduplicate candidates
Before and after manual review, candidates are compared against each other and against existing records in the database to catch overlapping cases — the same underlying event reported by multiple outlets, or a new batch that overlaps with something already published. Duplicates are merged rather than dropped, preserving source coverage from each report. The deduplication step uses Claude to identify clusters of likely-same events within a country group, then applies an anchorless merge that carries forward all confirmed facts from all sources.
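The merge half of this step might look roughly like the sketch below. It reads "anchorless" as meaning that no single report is treated as the primary record, which is an interpretation and not something the text confirms. The clustering is assumed to have happened already, and `merge_cluster` and its field names are hypothetical.

```python
from typing import Optional

Record = dict[str, Optional[str]]


def merge_cluster(cluster: list[Record]) -> dict:
    """Merge reports of the same event without privileging any one of them."""
    merged: dict = {"source_urls": [], "conflicts": {}}
    for record in cluster:
        for key, value in record.items():
            if value is None:
                continue  # a blank in one report never overwrites a fact from another
            if key == "source_url":
                if value not in merged["source_urls"]:
                    merged["source_urls"].append(value)  # keep coverage from every report
            elif key not in merged:
                merged[key] = value
            elif merged[key] != value:
                # Disagreements are recorded for the reviewer instead of silently resolved.
                seen = merged["conflicts"].setdefault(key, [merged[key]])
                if value not in seen:
                    seen.append(value)
    return merged


report_a = {"country": "Exampleland", "event_date": None,
            "target_name": "Example Org", "source_url": "https://example.org/a"}
report_b = {"country": "Exampleland", "event_date": "2024-03-01",
            "target_name": "Example Organisation", "source_url": "https://example.org/b"}
merged = merge_cluster([report_a, report_b])
```

Here the date comes from the second report, both source URLs survive, and the two spellings of the target name are surfaced as a conflict for the human reviewer.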
6
Review manually
Each candidate case is reviewed by a person. The reviewer checks whether the source supports the claim, whether the record fits the project's definition, whether the summary is fair, and whether the structured fields are usable. An AI pre-screening pass can flag uncertain cases or low-confidence extractions before they reach this step, but it does not replace the human decision. Records can be accepted with edits, accepted as-is, or rejected. Nothing reaches the public database without a person's explicit approval.
7
Fill remaining gaps
After approval, an enrichment pass fills fields that were left blank during extraction. Event dates are inferred from the article text when not explicitly stated — the date the account was frozen or the payment was blocked, not necessarily the date the article was published. If the article gives no usable date, the published date is used as a conservative upper bound. Missing country fields are filled from context. Individual names that were extracted without role or nationality context are enriched with a short descriptive prefix so they are readable without the surrounding article. Only accepted records are enriched. The enriched fields are marked as inferred where relevant.
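The date fallback described above can be sketched as a small function. The names `fill_event_date`, `date_basis`, and `inferred_fields` are assumptions for this example, and the caller is assumed to pass only accepted records.

```python
from datetime import date
from typing import Optional


def fill_event_date(record: dict, inferred: Optional[date], published: date) -> dict:
    """Return a copy of an accepted record with event_date filled if it was blank."""
    enriched = dict(record)
    enriched["inferred_fields"] = list(record.get("inferred_fields", []))
    if enriched.get("event_date"):
        return enriched  # extraction already found an explicit date
    if inferred is not None:
        # Date of the restriction itself, worked out from the article text.
        enriched["event_date"] = inferred.isoformat()
        enriched["date_basis"] = "inferred_from_text"
    else:
        # No usable date in the text: the event happened on or before publication.
        enriched["event_date"] = published.isoformat()
        enriched["date_basis"] = "published_date_upper_bound"
    enriched["inferred_fields"].append("event_date")
    return enriched


record = {"target_name": "Example Org", "event_date": None}
with_text_date = fill_event_date(record, date(2024, 3, 1), date(2024, 3, 9))
fallback = fill_event_date(record, None, date(2024, 3, 9))
```

Recording the basis alongside the value lets a reader tell an inferred date from an upper bound, and the original record is left unmodified.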
8
Publish approved cases
Only manually approved and enriched cases are exported to the public site. The exported dataset feeds the map, case table, and front-page summary. A dated version snapshot is saved each time the dataset is exported, so researchers can cite the version they used. Internal notes, AI confidence scores, rejected leads, and unresolved edge cases are not included in the public export.
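An export step with these properties could be sketched as follows. The field names, the `review_status` values, and the file naming scheme are assumptions for illustration; the example writes to a temporary directory so it can be run safely.

```python
import json
import tempfile
from datetime import date
from pathlib import Path

# Hypothetical names for fields that must never leave the internal database.
INTERNAL_FIELDS = {"internal_notes", "ai_confidence", "review_status"}


def build_public_export(records: list[dict]) -> list[dict]:
    """Keep only approved records and strip internal fields from them."""
    return [
        {key: value for key, value in record.items() if key not in INTERNAL_FIELDS}
        for record in records
        if record.get("review_status") == "approved"
    ]


def write_export(records: list[dict], out_dir: Path, today: date) -> Path:
    """Write the current dataset plus a dated snapshot; return the snapshot path."""
    payload = json.dumps(build_public_export(records), ensure_ascii=False, indent=2)
    out_dir.mkdir(parents=True, exist_ok=True)
    (out_dir / "cases.json").write_text(payload, encoding="utf-8")
    snapshot = out_dir / f"cases-{today.isoformat()}.json"
    snapshot.write_text(payload, encoding="utf-8")
    return snapshot


records = [
    {"target_name": "Example Org", "review_status": "approved",
     "ai_confidence": 0.8, "internal_notes": "checked two sources"},
    {"target_name": "Unclear Lead", "review_status": "rejected", "ai_confidence": 0.3},
]
with tempfile.TemporaryDirectory() as tmp:
    snapshot_path = write_export(records, Path(tmp), date(2026, 1, 15))
    snapshot_name = snapshot_path.name
    exported = json.loads(snapshot_path.read_text(encoding="utf-8"))
```

Filtering on approval status and stripping internal fields in one place means the live file and the citable snapshot always contain the same public content.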
Working Definition
The project uses financial censorship as a broad working term for politically or reputationally motivated restrictions on financial access. The public dataset systematically tracks closures: cases where a specific financial restriction happened to a named target at a documentable moment. Exclusion and sanction are recognised as related forms of financial censorship, and some documented cases may appear in the record, but neither is yet tracked systematically in a section of its own.
Every inclusion decision involves judgment about whether a restriction was politically or reputationally motivated. Reasonable people will disagree with some of these judgments. The criteria on this page explain how I make the call. If a case is wrong, incomplete, or unfairly described, the correction process exists so that it can be fixed.
Corrections And Disputes
Anyone can submit a correction, removal request, or source note using the links in the footer. A disputed case normally remains visible while it is under review, unless there is an obvious factual error, privacy concern, safety concern, or source problem that makes temporary removal appropriate. Possible outcomes include correcting the summary, changing structured fields, adding a better source, moving the record out of public view, or removing it.
Rejected submissions are not published, and submitters are not individually notified. If you believe a rejection was incorrect, contact the maintainer directly with public sources that support the case.
Update Rhythm
Automated source checks are intended to run monthly for routes that are reliable and easy to query. Manual research and review continue outside that schedule. Because every public record requires human approval, discovery and publication do not happen at the same speed. The "last updated" date on the front page reflects when the dataset was last exported, not when discovery last ran. Dated dataset snapshots are kept in the public export so researchers can cite the version they used.
Limitations
The dataset can only show what public reporting makes visible. That means it will always miss some of the quietest cases: informal warnings, unexplained account denials, private correspondence between banks and regulators, and restrictions in places where reporting itself is dangerous. Source coverage is uneven; English-language and Western European outlets are better represented than sources in other languages or regions. Motives can also be ambiguous: a restriction that looks politically motivated may have a compliance or fraud explanation, and the reverse is also possible. The monitor records what public sources report, not a definitive account of motive. There is often a lag between an incident and its public reporting, and some cases may later be corrected, reclassified, or removed as better information becomes available.
How To Cite
Kokkomäki, V. (2026). Financial Censorship Monitor [dataset]. debankd.org.