Published on January 4, 2026

Why the PHQ Depression Questionnaire May Be Misleading Clinicians and Researchers

A critical look at new evidence questioning one of the most widely used depression screening tools

Depression screening tools play a central role in modern mental health care. Among the most widely used instruments worldwide is the Patient Health Questionnaire, commonly referred to as the PHQ. Versions such as the PHQ-2, PHQ-8, and PHQ-9 are embedded in primary care, psychiatric clinics, large-scale research studies, and digital health platforms. They are often treated as reliable indicators of depressive symptom severity and are used to guide diagnosis, treatment decisions, and even public health policy.

However, new research published in JAMA Psychiatry raises serious concerns about whether the PHQ actually measures what it claims to measure. According to the study, a large proportion of both community members and clinical patients misunderstand the questionnaire instructions. This misinterpretation may undermine the validity of PHQ scores and complicate how clinicians and researchers interpret depression severity.

This article explores the study’s findings, explains why the distinction between symptom frequency and symptom burden matters, and discusses the broader implications for mental health assessment.

What Is the Patient Health Questionnaire and Why It Matters

The Patient Health Questionnaire was developed to provide a brief, standardized way to assess depressive symptoms. The PHQ-9, the most commonly used version, asks respondents how often they have experienced nine core symptoms of depression over the past two weeks. Response options range from “not at all” to “nearly every day.”
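To make the scoring concrete, here is a minimal sketch of how PHQ responses are usually turned into a total: each of the four response options is scored 0 to 3 and the item scores are summed. The two intermediate labels ("several days" and "more than half the days") and the function name phq_total are not taken from this article, and the example answers are invented for illustration.

```python
# Item scores for the four PHQ response options (0-3 per item).
RESPONSE_SCORES = {
    "not at all": 0,
    "several days": 1,
    "more than half the days": 2,
    "nearly every day": 3,
}


def phq_total(responses):
    """Sum the item scores for a list of response labels.

    Pass 9 responses for a PHQ-9 total (range 0-27) or 8 for a PHQ-8
    total (range 0-24). The helper name is hypothetical.
    """
    return sum(RESPONSE_SCORES[answer.strip().lower()] for answer in responses)


# Invented example: nine PHQ-9 answers from one hypothetical respondent.
example_answers = (
    ["several days"] * 4
    + ["nearly every day"] * 2
    + ["not at all"] * 3
)

print(phq_total(example_answers))  # 4*1 + 2*3 + 3*0 = 10
```

The total says nothing about how the respondent decided which option to pick, which is the question the rest of this article turns on.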

The instructions tell respondents to rate symptoms based on how often they were “bothered” by them. This wording is subtle but important. The PHQ is intended to measure severity, not merely the presence or frequency of symptoms. A symptom that occurs often but causes little distress should theoretically be scored differently than a symptom that occurs less frequently but is deeply distressing.

In practice, clinicians and researchers usually assume that PHQ scores represent symptom frequency and severity in a consistent way across individuals. The new study challenges that assumption.

Overview of the JAMA Psychiatry Study

The study, titled “Interpretation Issues With the Patient Health Questionnaire Instructions,” was published online on December 17, 2025, in JAMA Psychiatry. The authors examined whether respondents interpret PHQ instructions as intended.

Study design and participants

Researchers analyzed data from two groups:

  • A general population sample of 503 adults recruited via Amazon Mechanical Turk
  • A clinical sample of 349 participants with moderate to high depression severity from the OPTIMA study, which focuses on digital phenotyping and anhedonia

Participants completed the PHQ-8 and then answered additional questions designed to assess how they interpreted the questionnaire instructions.

Key questions assessed

Participants were asked three critical questions:

  1. How they would respond to a PHQ sleep item in a hypothetical scenario where they overslept nearly every day but were not bothered by it
  2. Whether their previous PHQ responses were based on symptom frequency, how bothered they felt, or both
  3. How they would answer the PHQ in the future using those same criteria

These questions allowed researchers to determine whether respondents understood and applied the instructions consistently.

Key Findings: Widespread Misinterpretation

The results revealed a striking pattern of misunderstanding.

In the hypothetical oversleeping scenario, only:

  • 54.7 percent of participants in the general population sample interpreted the PHQ as intended
  • 15.5 percent of participants in the clinical sample did so

When participants were asked how they had actually responded to the PHQ:

  • Only 21.3 percent of the general population sample reported following the instructions correctly
  • Only 11.7 percent of the clinical sample did the same

When asked how they would respond in the future, the numbers remained similarly low. This suggests that the misinterpretation is not a random or one-off lapse but a consistent pattern in how respondents approach the questionnaire.

In other words, most respondents were not answering the PHQ based on how often symptoms bothered them. Instead, many were answering based on raw frequency, perceived severity, or a personal mixture of both.

Why Frequency vs Bother Matters in Depression Measurement

At first glance, the difference between symptom frequency and symptom burden may seem minor. In reality, it has major implications.

Consider two individuals:

  • Person A sleeps poorly nearly every night but has adapted to it and feels only mildly distressed
  • Person B has difficulty sleeping a few nights a week but finds it extremely distressing and disruptive

If both individuals answer the PHQ based on frequency alone, Person A may appear more depressed. If they answer based on how bothered they feel, Person B may score higher. If respondents interpret the questionnaire differently, their scores become difficult to compare.
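The comparison between Person A and Person B can be sketched in a few lines of Python. Everything numeric here is an assumption made for illustration: the day counts for each person are invented, and the cutoffs that map a number of days to a 0 to 3 item score are hypothetical, since the PHQ does not define its response options in exact days.

```python
from dataclasses import dataclass


@dataclass
class SleepReport:
    name: str
    nights_with_problem: int  # nights with poor sleep in the past 14 days
    days_bothered: int        # days the person felt bothered by it, out of 14


def days_to_item_score(days: int) -> int:
    """Map a day count (0-14) to a 0-3 item score using hypothetical cutoffs."""
    if not 0 <= days <= 14:
        raise ValueError("days must be between 0 and 14")
    if days == 0:
        return 0  # "not at all"
    if days <= 7:
        return 1  # "several days"
    if days <= 11:
        return 2  # "more than half the days"
    return 3      # "nearly every day"


def score_by_frequency(report: SleepReport) -> int:
    # Respondent counts how often the symptom occurred.
    return days_to_item_score(report.nights_with_problem)


def score_by_bother(report: SleepReport) -> int:
    # Respondent counts how often the symptom bothered them.
    return days_to_item_score(report.days_bothered)


# Invented numbers: Person A sleeps poorly almost every night but is rarely
# bothered; Person B sleeps poorly less often but is bothered on more days.
people = [
    SleepReport("Person A", nights_with_problem=13, days_bothered=3),
    SleepReport("Person B", nights_with_problem=6, days_bothered=9),
]

for person in people:
    print(
        f"{person.name}: frequency reading = {score_by_frequency(person)}, "
        f"bother reading = {score_by_bother(person)}"
    )
```

With these made-up inputs, Person A scores 3 under the frequency reading and 1 under the bother reading, while Person B scores 1 and 2. The ranking of the two people flips depending on which reading each respondent happens to use.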

This inconsistency undermines one of the core purposes of standardized screening tools, which is to allow meaningful comparisons across individuals, settings, and time.

Implications for Clinical Practice

The findings raise important concerns for clinicians who rely on the PHQ to guide care.

Diagnostic accuracy

If patients interpret the PHQ differently, clinicians may overestimate or underestimate depression severity. This could lead to inappropriate treatment decisions, such as prescribing medication when it is not needed or failing to intervene when support is necessary.

Monitoring treatment progress

The PHQ is often used to track changes in symptoms over time. If a patient’s interpretation of the questionnaire shifts or differs from the clinician’s assumptions, changes in scores may not reflect true changes in mental health.

Shared decision making

Clinicians may assume that PHQ scores reflect distress, while patients may be reporting frequency alone. This mismatch can affect conversations about treatment goals and priorities.

Implications for Research and Public Health

The study also has far-reaching implications for mental health research.

Large epidemiological studies often use the PHQ to estimate depression prevalence. Clinical trials frequently use PHQ scores as primary or secondary outcomes. Digital mental health tools rely on PHQ data to personalize interventions.

If respondents interpret the questionnaire inconsistently, the validity of these findings is called into question. Differences in PHQ scores across populations or studies may reflect differences in interpretation rather than true differences in depression severity.
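A toy calculation shows how this can happen. In the sketch below, every respondent in both groups has the same experience, modeled on the oversleeping scenario described earlier: the symptom occurs nearly every day (item score 3 if read as frequency) but does not bother them (item score 0 if read as bother). The group sizes and the share of respondents using each reading are invented and are not the study's figures.

```python
def mean_item_score(n_respondents, n_bother_readers,
                    frequency_score=3, bother_score=0):
    """Mean item score for a group whose members share one experience
    but split between two readings of the instructions (hypothetical model)."""
    n_frequency_readers = n_respondents - n_bother_readers
    total = (n_bother_readers * bother_score
             + n_frequency_readers * frequency_score)
    return total / n_respondents


# Invented groups: identical experiences, different mix of interpretations.
group_1 = mean_item_score(n_respondents=100, n_bother_readers=50)
group_2 = mean_item_score(n_respondents=100, n_bother_readers=20)

print(group_1, group_2)  # 1.5 2.4
```

The two groups differ by almost a full point on a single item even though nobody's underlying experience differs, which is the kind of artifact that could be mistaken for a real difference in severity.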

This issue is especially important when comparing clinical and nonclinical samples. The study found that misinterpretation was even more common in the clinical sample, whose participants had moderate to high depression severity, which may distort comparisons between groups.

Does This Mean the PHQ Is Useless?

Not necessarily. The PHQ remains a valuable tool, particularly because it is brief, accessible, and well validated in many contexts. However, this study suggests that its instructions may not be as clear or intuitive as previously assumed.

Rather than abandoning the PHQ, the findings point to the need for improvement. Possible solutions include:

  • Revising the wording of instructions to state clearly whether respondents should rate symptom burden or symptom frequency
  • Providing examples to clarify how respondents should interpret questions
  • Training clinicians to discuss how patients answered the questionnaire
  • Using complementary assessment methods such as clinical interviews or ecological momentary assessments

A Broader Lesson About Mental Health Measurement

Beyond the PHQ itself, this study highlights a broader challenge in mental health assessment. Psychological constructs such as depression are subjective and multidimensional. Capturing them with brief self-report tools is inherently complex.

Assumptions about shared understanding between questionnaire developers, clinicians, and patients may not hold true. Even small ambiguities in wording can have large effects on how data are generated and interpreted.

As mental health care becomes increasingly data-driven, ensuring that measurement tools are both valid and interpretable is more important than ever.

Conclusion

The new JAMA Psychiatry study provides compelling evidence that the Patient Health Questionnaire is widely misunderstood by both community members and clinical patients. Most respondents do not interpret the instructions as intended, and this misinterpretation appears to be stable over time.

These findings raise important questions about how PHQ scores are used in clinical decision making, research, and public health surveillance. While the PHQ remains a valuable tool, clinicians and researchers should be cautious in assuming that scores reflect symptom burden in a consistent way.

Ultimately, improving mental health assessment will require clearer tools, better communication, and a deeper appreciation of how individuals understand and report their experiences.

Source

Panayiotou M, Razum J, Eisele G, et al. Interpretation Issues With the Patient Health Questionnaire Instructions. JAMA Psychiatry. Published online December 17, 2025. doi:10.1001/jamapsychiatry.2025.3796

Disclaimer

This article is for informational and educational purposes only. It does not constitute medical advice, diagnosis, or treatment. Readers should consult qualified healthcare professionals regarding mental health concerns or clinical decision making. The interpretations presented here are based on a single published study and should be considered within the broader scientific literature.
