Does AI and Machine Learning Research Using Human Data Need IRB Review?


AI and ML teams in digital health are working with human data at a scale and pace the regulatory framework wasn't built for, and "do I need IRB review for this?" has become a common question. The answer is not always yes, but it's not obviously no either, and getting it wrong in either direction can cause real problems.

It comes down to two definitions

Under the Common Rule (45 CFR 46.102), IRB oversight kicks in when you're doing research involving human subjects. Both are defined terms in the regulation, and both must be met before the rules apply.

Research is a systematic investigation designed to develop or contribute to generalizable knowledge. If you're building something purely for internal quality improvement and you genuinely have no plans to publish or present the findings, it may fall outside this definition. But a lot of people say "this is just internal" and then six months later want to submit to a conference. If there's any chance the results end up in a paper or a regulatory submission, treat it as research.

Human subjects are living individuals about whom you obtain data through interaction or intervention, or identifiable private information. Most AI/ML projects don't involve interacting with anyone directly. They involve datasets. So the real question is whether the data is identifiable. Can someone, with reasonable effort, link records back to a specific person? If yes, those people are human subjects and the regulations apply. If the data has been truly de-identified, stripped of the 18 HIPAA identifiers with no reasonable basis to believe re-identification is possible, then you're likely outside the definition.
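The two-part test can be reduced to a simple predicate. This is an illustrative simplification, not legal logic, and the function and parameter names are ours, not the regulation's:

```python
def needs_common_rule_review(systematic_investigation: bool,
                             generalizable_knowledge: bool,
                             obtains_identifiable_data: bool,
                             interacts_or_intervenes: bool) -> bool:
    """Illustrative sketch of the two-part test under 45 CFR 46.102.

    Both definitions must be met: the activity is "research" AND it
    involves "human subjects". Either prong of the human-subjects
    definition is enough on its own.
    """
    is_research = systematic_investigation and generalizable_knowledge
    involves_human_subjects = obtains_identifiable_data or interacts_or_intervenes
    return is_research and involves_human_subjects

# A model trained on truly de-identified data, with results headed for
# publication: it's research, but (likely) involves no human subjects.
needs_common_rule_review(True, True, False, False)  # False
```

Note that the conjunction is what trips people up in both directions: publishable work on de-identified data fails the second prong, while genuinely internal QI work on identifiable records fails the first.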

For a broader look at when IRB review is and isn't required, we have a separate post on legal vs. practical requirements.

Identifiability is harder than it looks

Datasets that appear de-identified can still carry re-identification risk. A combination of rare diagnoses, demographic details, and timestamps can make someone uniquely identifiable even without names or MRNs attached. This trips up a lot of ML teams, particularly the ones working with free-text clinical notes. Notes are full of incidental identifying information: provider names, references to specific facilities, unusual case descriptions that narrow the pool to one patient. Automated scrubbing tools miss more than people assume.
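To make the quasi-identifier problem concrete, here is a toy uniqueness check. The records and field names are invented for illustration; the point is that rows with no names or MRNs can still sit in an equivalence class of one:

```python
from collections import Counter

# Invented toy records: names and MRNs already stripped, but
# quasi-identifiers (diagnosis code, 3-digit zip, age, admit month) remain.
records = [
    {"dx": "J18.9", "zip3": "601", "age": 54, "month": "2023-03"},
    {"dx": "J18.9", "zip3": "601", "age": 54, "month": "2023-03"},
    {"dx": "Q87.4", "zip3": "601", "age": 54, "month": "2023-03"},  # rare diagnosis
    {"dx": "J18.9", "zip3": "602", "age": 71, "month": "2023-05"},
]

def count_unique(rows, keys):
    """Return how many rows are the ONLY record with their exact
    combination of quasi-identifier values (equivalence class of 1)."""
    combos = Counter(tuple(r[k] for k in keys) for r in rows)
    return sum(1 for r in rows if combos[tuple(r[k] for k in keys)] == 1)

keys = ["dx", "zip3", "age", "month"]
print(count_unique(records, keys), "of", len(records), "records are unique")
# 2 of 4 records are unique
```

The rare-diagnosis row is unique even though every individual field looks innocuous; it's the combination that identifies.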

If your data was de-identified under HIPAA's Safe Harbor or Expert Determination methods and you have no way to re-link it, you're generally in the clear. But if you received a "limited dataset" with dates or zip codes still attached, or if a re-linking key exists somewhere you could access it, the data is identifiable and you're back in human subjects territory.

Secondary data and the exempt pathway

A lot of AI/ML work is secondary analysis: training models on hospital records, insurance claims, or public health registries that were collected for a different purpose. The Common Rule anticipated this. Category 4 under 45 CFR 46.104(d)(4) covers secondary use of identifiable data or biospecimens when certain conditions are met, like the data being publicly available or the researcher not receiving identifiers.

If your project fits Category 4, an IRB can issue an exempt determination instead of a full review. Faster, cheaper, less paperwork. But you still need the IRB to make that call. You can't self-certify exemption. We've written a detailed breakdown of the exempt categories if you want to go deeper.

When the FDA gets involved

If your AI/ML tool is going to be used for clinical decision-making (flagging suspicious lesions on radiology images, predicting sepsis risk from vitals, that sort of thing), it likely qualifies as Software as a Medical Device under the FDA's framework. And if you're running a clinical investigation to generate data for a 510(k), De Novo, or PMA submission, you're in FDA-regulated territory regardless of whether you have federal funding.

FDA clinical investigations are governed by 21 CFR 50 and 21 CFR 56. The Common Rule's exempt categories don't apply. You need IRB review, either expedited or full-board depending on risk. Our post on which regulations apply when goes into the HHS vs. FDA distinction in more detail.

Three scenarios your work could fall under

Diagnostic algorithm trained on de-identified records. You receive a dataset of chest X-rays and radiology reports with all 18 HIPAA identifiers stripped out. No re-identification key exists. You train a pneumothorax detection model. This probably isn't human subjects research, and strictly speaking, no IRB review is required. That said, if you plan to publish, most journals will still want a formal "not human subjects research" determination letter. It's a short process and worth doing preemptively.

NLP model trained on clinical notes. This one is murkier. Say you get access to clinical notes from a health system's data warehouse. They went through a de-identification pipeline, but narrative text is inherently messy. Unusual case descriptions, doctor names, references to specific facilities. Automated scrubbing catches a lot, but not everything. An IRB should assess whether the de-identification is actually adequate here. If the data turns out to be effectively identifiable, you're in human subjects territory. It might still qualify for exempt status under Category 4, but only an IRB can make that determination.
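The failure mode is easy to demonstrate. Below is a deliberately naive scrubber; the provider names and note text are invented. Real de-identification pipelines use trained NER models rather than name lists, but the gap is structurally the same: identifiers the pipeline doesn't recognize pass straight through.

```python
import re

# Invented provider list -- a stand-in for whatever the pipeline knows about.
KNOWN_PROVIDERS = ["Dr. Okafor", "Dr. Lindqvist"]

def scrub(note: str) -> str:
    """Replace known provider names with a placeholder token."""
    for name in KNOWN_PROVIDERS:
        note = re.sub(re.escape(name), "[PROVIDER]", note)
    return note

note = ("Seen by Dr. Okafor after transfer from St. Brigid's Hospital. "
        "Only documented case of this syndrome in the county.")
print(scrub(note))
# The provider name is masked, but the facility name and the "only case"
# phrasing survive -- either can narrow the pool to a single patient.
```

This is why an IRB's adequacy review matters: the question isn't whether a scrubber ran, but whether what it left behind is still effectively identifying.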

Prospective validation on live patients. You're testing a sepsis prediction algorithm in real time. Clinicians receive alerts based on model output. Patients are directly affected by whether those alerts are right or wrong. This is clearly a clinical investigation involving human subjects. You need IRB review, you need to think through informed consent, and you need a data monitoring plan. If the tool is heading toward an FDA submission, the FDA regulations layer on as well.

The practical case for getting a determination even when you don't have to

Say you're working with truly de-identified public data and you have no federal funding. The Common Rule may not technically apply. But then you try to publish, and the journal asks for evidence of ethical oversight. Or a hospital partner wants documentation before they'll share data with you. Or an investor doing diligence asks how you sourced your training data. We see this pattern a lot: teams skip the IRB step because they don't think they need it, and then spend more time retroactively explaining why they didn't get one than the review itself would have taken. A determination letter from an IRB, even one that says "not human subjects research," handles all of those conversations at once.

Where Tempus IRB fits in

We have experience reviewing AI and ML studies. Exempt determinations for secondary data analysis, expedited reviews for prospective SaMD validations, formal "not human subjects" determinations for teams that just need the letter. Exempt determinations are a flat $1,300, and we turn most around within a day. If you're not sure what level of review your project needs, email us and we can help you sort it out.

This post is for informational purposes only and does not constitute legal advice.
