Data Visiting as Design-Based Governance

I was recently asked to speak on the topic of “Data Visiting as Design-Based Stewardship Supporting a Multi-Dimensional Governance Configuration.” I want to take issue with a word in that title.

The word is “stewardship.” It is the wrong frame.

Stewardship avoids the language of ownership. A steward keeps something safe. An owner uses something. An owner has legal rights — the right to exploit, to license, to build value. A steward has none of those rights.

Why does this matter? It matters because in Africa, we need institutions — universities, hospitals, biobanks, research councils — to understand themselves as owners of data. Ownership has legal bite. Ownership means you can decide how your data is used, who benefits, and on what terms. If African institutions see themselves merely as stewards — as custodians keeping data safe until someone from the Global North comes to analyse it — then we have not escaped data colonialism. We have institutionalised it.

Data visiting, properly understood, is a tool that empowers data owners. It allows an institution to say: you may analyse our data, but on our terms, in our environment, under our control. That is not stewardship. That is ownership in action.

What data visiting is — and what it is not

Data visiting is a form of data sharing in which data are analysed within the provider’s computing environment, without being physically transferred. The researcher visits the data; the data do not travel to the researcher.

Data visiting is not the entirety of data governance. Ethics committees, data access committees, institutional review boards, data use agreements — all of these remain part of the governance landscape. What data visiting offers is something different: it is a design-based governance tool that can be integrated into a broader governance framework. It gives data owners a configurable technical architecture through which governance decisions can be implemented directly.

A call for terminological convergence

Before going further, a point on terminology. The field must converge on data visiting — not “data visitation,” not other variants. GA4GH has adopted “data visiting.” The academic literature overwhelmingly uses “data visiting.” If we are serious about building a shared governance vocabulary, we cannot afford terminological fragmentation. The concept is hard enough to communicate without muddying it with competing labels. Data visiting is the term. Let us use it consistently.

The one-dimensional trap

Too often, we hear: “We use data visiting” — as if that settles the governance question. It does not.

Consider two systems. In the first, the researcher has full autonomy to run custom code on identifiable data, with unrestricted output — that is data visiting. In the second, the researcher submits a fixed query and receives only reviewed aggregate statistics — that is also data visiting. The governance implications could not be more different.

Saying “we do data visiting” is like saying “we have a contract.” It tells you nothing about the terms. Data visiting is not one thing. It is a configuration space — a multi-dimensional design surface. And if you reduce it to one dimension, you will get your governance wrong.

Seven dimensions, seven governance levers

This is why I developed the Seven-Dimensional Data Visiting Framework — the 7D-DVF. It disaggregates data visiting into seven adjustable dimensions: researcher autonomy, data location, data visibility, the nature of the shared data, output governance, the trust and control model, and auditability and traceability.

Each dimension is a governance lever — a concrete design decision with direct legal and ethical consequences. Researcher autonomy: how much freedom does the visiting researcher have? Data visibility: can they see raw records, or only aggregates? Output governance: are results reviewed before release, or exported freely?

The power of the framework is that it makes the governance configuration legible. An ethics committee reviewing a data visiting proposal can assess each dimension independently and ask: is this calibration proportionate to the risk? And a data owner — an African university, a national biobank — can use these levers to assert control over how their data is accessed, on their terms, in service of their priorities.
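To make the configuration-space idea concrete, here is a minimal Python sketch of a data visiting configuration along the seven dimensions. The field names and example values are my own illustrative shorthand, not canonical 7D-DVF vocabulary; treat it as a sketch of the design surface, not an implementation of the framework.

```python
from dataclasses import dataclass

# Illustrative only: the field names and string values below are my own
# shorthand for the seven 7D-DVF dimensions, not published terminology.
@dataclass(frozen=True)
class DataVisitingConfig:
    """One point in the seven-dimensional data visiting configuration space."""
    researcher_autonomy: str   # e.g. "fixed-query" | "approved-scripts" | "custom-code"
    data_location: str         # e.g. "on-premises" | "sovereign-cloud"
    data_visibility: str       # e.g. "aggregate-only" | "record-level"
    shared_data_nature: str    # e.g. "pseudonymised" | "identifiable"
    output_governance: str     # e.g. "manual-review" | "automated-checks" | "unrestricted"
    trust_control_model: str   # e.g. "owner-operated" | "federated"
    auditability: str          # e.g. "full-provenance-log" | "access-log-only"

# The two systems from the "one-dimensional trap" example: both are
# "data visiting", yet their governance configurations are worlds apart.
locked_down = DataVisitingConfig(
    "fixed-query", "on-premises", "aggregate-only", "pseudonymised",
    "manual-review", "owner-operated", "full-provenance-log",
)
wide_open = DataVisitingConfig(
    "custom-code", "on-premises", "record-level", "identifiable",
    "unrestricted", "federated", "access-log-only",
)
```

An ethics committee can read each field of such a record independently and ask whether that calibration is proportionate to the risk, which is exactly the kind of legibility the framework is meant to provide.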

Design as governance

This brings me to the central insight. In data visiting, design functions as governance. When you choose to restrict data visibility to query-only access, that is a governance decision implemented through technical design. When you require output review before release, that is governance embedded in the system architecture.

Data visiting gives data owners the ability to embed governance decisions directly into the technical infrastructure. This is not governance layered on top of a system — it is governance built into the system. The 7D-DVF provides the language and the structure to do this rigorously, deliberately, and in a way that serves the interests of the data owner.

For Africa, this is transformative. It means that an institution that owns genomic data can participate in global research collaborations without surrendering control — without the data leaving, without ceding sovereignty, and with every governance parameter configured to build local capacity and contribute to an African bio-economy. Not as stewards. As owners.

A challenge

Stop treating data visiting as a binary. It is a multi-dimensional configuration space. Stop saying “we do data visiting” as if that answers the governance question. Specify which data visiting — along how many dimensions, calibrated to what risks, in what legal and ethical context. And recognise data visiting for what it is — not a stewardship tool, but an ownership tool. A tool that lets data owners govern on their own terms, build their own capacity, and participate in global science as equals.

The tools exist. Use them.

Which AI models actually know South African law?

In my latest article, I put five of the most popular AI tools through a South African legal obstacle course to see how they perform in reasoning through real legal scenarios. The idea was simple: can generative AI — not trained specifically on South African law — reason like a local lawyer?

The results were illuminating. Some models impressed. Others, well, should probably be held in contempt of court.

The study covered three scenarios drawn from private law:

1. What happens when a dachshund bites someone?

2. Do you have to pay if you refuse to take your bakkie back after it’s been serviced?

3. Who’s liable when a veldfire gets out of control?

This post focuses on that third scenario. If you’d like to see how they handled the sausage dog and the car dispute, you’ll find all the details (and comparative scores) in the full article here.

Setting the scene: fire in the Midlands

Imagine Jacob, a cattle farmer in the KZN Midlands, decides to burn dry grass on his farm to clear it for new growth. His neighbour Maria has warned him, repeatedly, about the risk — the wind tends to carry embers across the fence. Jacob proceeds anyway. Predictably, the fire jumps the fence, damages Maria’s grazing land, and injures two of her prized Nguni cattle.

Maria demands compensation. Jacob says it was an accident.

Now, this is no mere exam hypothetical — it’s a legal minefield, blending common-law delict and the National Veld and Forest Fire Act 101 of 1998 (NVFFA).

What the law requires (and what AI needs to spot)

At common law, this is a textbook case of actio legis Aquiliae — delictual liability for patrimonial loss. The plaintiff must show five elements: conduct, wrongfulness, fault, causation, and damage.

But the NVFFA raises the stakes. Under section 34(1), there’s a statutory presumption of negligence if a veldfire spreads from one property and causes harm. This flips the burden: unless Jacob can show he took reasonable precautions, he is presumed negligent. That statutory overlay is not optional — it defines how a South African court would approach the matter.
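The logic of the burden shift can be sketched schematically. This is a didactic simplification of how the section 34(1) presumption operates on these facts, not a statement of the law; the function and its parameters are hypothetical shorthand.

```python
# Schematic sketch of the s 34(1) NVFFA burden shift. A didactic
# simplification for illustration, not legal advice.
def presumed_negligent(fire_spread_from_defendants_land: bool,
                       defendant_proved_reasonable_precautions: bool) -> bool:
    """Once the plaintiff shows the veldfire spread from the defendant's
    land, negligence is presumed; the burden shifts to the defendant,
    who must rebut it by proving reasonable precautions were taken."""
    if not fire_spread_from_defendants_land:
        # The presumption never arises; the plaintiff must prove fault
        # in the ordinary way under the actio legis Aquiliae.
        return False
    return not defendant_proved_reasonable_precautions

# Jacob's position: the fire spread, and no precautions were shown.
print(presumed_negligent(True, False))  # → True: the presumption stands
```

This is precisely the doctrinal move a model must spot: the statute changes who has to prove what, and an analysis that applies only the common-law elements misses it.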

In this scenario, I was especially interested to see which AI models could:

• Identify actio legis Aquiliae as the correct cause of action,
• Recognise the relevance of the NVFFA,
• Incorporate the statutory presumption into their analysis,
• And apply it coherently to the facts.

Spoiler: only one of them did all of this.

Claude — the star pupil

Claude not only identified the actio legis Aquiliae correctly, but it also engaged with the NVFFA in a legally accurate way. It recognised the statutory presumption of negligence, discussed section 12(1) (the duty to maintain firebreaks), and even flagged whether Jacob belonged to a fire protection association.

Claude’s analysis wasn’t just correct — it was legally structured, cited real case law, and anticipated counterarguments. This is the only model I would even consider letting draft a first-year exam answer — let alone a client memo.

ChatGPT — clever, but forgot about statute law

ChatGPT correctly identified the delictual claim and applied the five elements sensibly, even citing Kruger v Coetzee appropriately for the test of negligence. But it missed the NVFFA entirely. That omission significantly weakens the analysis, as it ignores the shift in evidentiary burden and the statutory duty of care.

Still, its output was coherent, reasonably structured, and persuasive — as long as no statute is in play, since it engages with legislation only when explicitly prompted.

DeepSeek — close, but misses the mark

DeepSeek followed a similar pattern to ChatGPT: good grasp of delictual structure, but no engagement with the statute. It also relied on real case law, though its application of legal principles was occasionally vague. Competent, but not reliable if the issue involves anything beyond textbook delict.

Grok and Gemini — not ready for the bar

Both Grok and Gemini performed poorly. Grok referred to a “delict of negligence” — a fundamental misunderstanding of how South African law frames fault. Neither model identified actio legis Aquiliae. Neither mentioned the NVFFA. Case law citations were weak or missing. These models felt like overseas exchange students bluffing their way through a South African law tutorial. Politely put: not helpful.

What this tells us about AI in legal research

The veldfire scenario offers a revealing stress test for generative AI. It shows that while large models can replicate form, their depth of legal reasoning varies wildly — especially when statutory law modifies common-law doctrine.

A few takeaways:

• Don’t assume AI knows the law. Sometimes it does; sometimes it doesn’t; and sometimes its knowledge is only partial.
• Citation ≠ comprehension. Some models cite real cases but don’t understand them; others hallucinate entirely.
• Structured reasoning is rare. Only one of the five models showed a true grasp of how common law, statute, and fact must interact in legal analysis.

Want more?

The full article contains all three scenarios, a comparative table of how the five models performed across seven legal criteria, and more detailed observations about hallucinated case law, doctrinal confusion, and where AI shows promise (and where it absolutely doesn’t).

🔗 Read the full article here.

Authenticity in the age of genetic engineering: lessons from a dachshund

Once upon a time, wolves roamed the wild—majestic, cunning, and fierce. Fast forward a few millennia, and my dachshund, Zazu, naps in a sunbeam, snorts at his food bowl when dinner is two minutes late, and experiences existential dread when the neighbour’s cat walks by. His legs are too short to run fast, his bark too comical to intimidate, and his idea of hunting involves dragging his plush toy into the laundry basket. And yet—is he less authentic than his wolf ancestors?

This isn’t just a quirky thought about a pampered pet. It’s a window into a serious philosophical question: does genetic engineering threaten authenticity?

Critics of genetic engineering often say yes. Their argument is that if children’s genomes have been engineered—if their traits were chosen—then they somehow are not really themselves. That genetic intervention imposes a kind of inauthenticity. But this assumes that authenticity flows from genetic purity, from being the way evolution left us. Which brings us back to Zazu.

Zazu is not “natural.” He is the product of generations of deliberate human selection. Yet he is not haunted by the ghost of lupine dignity. He does not need to chase elk across the tundra to be real. He lives his life—ears flapping, legs flying, heart full. And therein lies the point: authenticity is not about where you come from. It’s about how you live.

In my recent article, “Existentialism and My ‘Postwolf’ Dachshund: Authenticity in the Age of Genetic Engineering” (Bioethics, 2025), I argue that we should understand authenticity through an existentialist lens, not an essentialist one. Existentialist philosophers like Heidegger and Sartre teach us that we are not defined by our origins, but by how we engage with the conditions of our existence. We are thrown into life—into a body, a time, a genome—and from there, we must choose. We must live.

From this view, genetic engineering does not inherently undermine authenticity. A genetically engineered trait is just one more part of the thrownness we must navigate. What matters is not whether a trait was chosen, but whether the individual can engage with it meaningfully, shape their life freely, and take ownership of who they become.

Of course, this doesn’t mean that all genetic engineering is ethically unproblematic. It means we need to evaluate interventions based on whether they support or undermine the individual’s future autonomy and capacity for authentic living. In my article, I propose combining two guiding principles: Procreative Beneficence (choose traits that enhance flourishing) and Procreative Non-Maleficence (don’t impose harmful constraints). These principles together offer a framework for ethical genetic decision-making—neither blindly optimistic nor paralysed by purity anxiety.

What existentialism reminds us is that there is no “true essence” to be protected in a pristine genome. The idea of a fixed human nature—like the idea of Zazu needing to recover his “authentic wolfness”—is philosophically brittle. We are not wolves, and that’s okay. Nor are we prisoners of our DNA. We are free beings, capable of shaping meaning from the givens of our lives—whether those givens arrived through nature, nurture, or a CRISPR edit.

If you’re interested in the full argument—including responses to Habermas and Fukuyama, a reconstruction of authenticity through Heidegger and Sartre, and the introduction of a dachshund named Zazu as an unlikely philosophical provocateur—you can read the article here:

🔗 https://doi.org/10.1111/bioe.13428

Can ChatGPT-4 draft complex legal contracts? I put it to the test

The legal profession is changing—and not slowly. With generative AI models like ChatGPT-4 becoming increasingly capable, it’s no longer a matter of if they’ll assist legal professionals, but how. So I decided to put ChatGPT-4 through its paces with one of the more demanding legal documents in my field: the Data Transfer Agreement (DTA) for health research.

DTAs are not your everyday consumer contracts. They deal with sensitive personal data, cross-border data flows, and legal compliance in an increasingly regulated landscape. They are intricate, specialist documents, and they’re rarely found in the public domain—meaning they likely weren’t heavily represented in the data ChatGPT-4 was trained on. In short, if generative AI can draft a decent DTA, that would be something worth paying attention to.

So I ran an experiment. First, I used ChatGPT-4 to generate an outline of a typical DTA. Then, I fed it each clause heading and asked for a detailed version of each clause. The result? A 6,800-word draft DTA—coherent, reasonably structured, and almost impressive.

But not perfect.

In my article just published in Humanities and Social Sciences Communications, I dig into where ChatGPT-4 excels and where it still falls short. Yes, it can generate most of the expected clauses. But its grasp of legal precision, clarity, and especially data protection compliance still leaves room for improvement. There are issues with redundancy, inconsistent use of terms, and ambiguous concepts like “derivative works.” And although it mentions security and compliance, it doesn’t always go deep enough to meet best-practice standards.

The takeaway? ChatGPT-4 is not ready to replace lawyers. But it is already a powerful tool in the legal drafting toolbox—especially when used strategically. My two-stage approach (first outlining, then refining clause-by-clause) is a method I believe many legal professionals can adopt to streamline their work while maintaining full control and accountability.
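The two-stage workflow can be sketched as a simple control flow in Python. The `ask` callable and the prompt wording are hypothetical placeholders for whatever model interface you use; this illustrates the outline-then-refine structure, not a drop-in implementation.

```python
from typing import Callable, List

def draft_agreement(ask: Callable[[str], str]) -> str:
    """Two-stage drafting: first request an outline, then expand each
    clause heading into full clause text. `ask` is any text-in/text-out
    LLM call; the prompts here are illustrative placeholders."""
    # Stage 1: get an outline, one clause heading per line.
    outline = ask("Draft an outline (one clause heading per line) "
                  "for a Data Transfer Agreement for health research.")
    headings: List[str] = [h.strip() for h in outline.splitlines() if h.strip()]
    # Stage 2: expand each heading into a detailed clause.
    clauses = [ask(f"Draft the full text of the clause: {h}") for h in headings]
    return "\n\n".join(clauses)

# Demonstration with a stub in place of a real model, just to show the flow:
def stub(prompt: str) -> str:
    if prompt.startswith("Draft an outline"):
        return "1. Definitions\n2. Permitted Use\n3. Security"
    heading = prompt.removeprefix("Draft the full text of the clause: ")
    return f"[clause text for: {heading}]"

print(draft_agreement(stub))
```

The point of splitting the work this way is that the human drafter stays in the loop at clause granularity: each generated clause can be reviewed, corrected, or discarded before it enters the final document.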

Most importantly, this experiment raised a set of broader questions—ethical ones. Should clients be told when AI was involved in drafting? Who is ultimately responsible for errors? And how do we ensure fairness in a world where algorithmic bias can quietly shape outcomes?

The answers aren’t simple. But one thing is clear: the future of legal drafting will be a collaboration between human lawyers and artificial intelligence. This study is a small, practical step in figuring out what that collaboration should look like.

If you’re curious to see what ChatGPT-4 produced—or if you’re a legal professional wondering how to make the most of AI without compromising on quality—you’ll find the full article here:

📄 Read the article