Depression screening tools play a central role in modern mental health care. Among the most widely used instruments worldwide is the Patient Health Questionnaire, commonly referred to as the PHQ. Versions such as the PHQ-2, PHQ-8, and PHQ-9 are embedded in primary care, psychiatric clinics, large-scale research studies, and digital health platforms. They are often treated as reliable indicators of depressive symptom severity and are used to guide diagnosis, treatment decisions, and even public health policy.
However, new research published in JAMA Psychiatry raises serious concerns about whether the PHQ actually measures what it claims to measure. According to the study, a large proportion of both community members and clinical patients misunderstand the questionnaire instructions. This misinterpretation may undermine the validity of PHQ scores and complicate how clinicians and researchers interpret depression severity.
This article explores the study’s findings, explains why the distinction between symptom frequency and symptom burden matters, and discusses the broader implications for mental health assessment.
The Patient Health Questionnaire was developed to provide a brief, standardized way to assess depressive symptoms. The PHQ-9, the most commonly used version, asks respondents how often they have experienced nine core symptoms of depression over the past two weeks. Response options range from “not at all” to “nearly every day.”
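For readers who work with PHQ data, the scoring behind these response options can be sketched in a few lines. This is a minimal illustration, not an official scoring tool: it assumes the standard four response labels, each worth 0 to 3 points, and the function name phq_total is hypothetical.

```python
# Standard PHQ response labels and their item scores (0 to 3).
RESPONSE_SCORES = {
    "not at all": 0,
    "several days": 1,
    "more than half the days": 2,
    "nearly every day": 3,
}


def phq_total(responses):
    """Sum item scores for a list of response labels (hypothetical helper).

    Accepts 8 items (PHQ-8) or 9 items (PHQ-9).
    """
    if len(responses) not in (8, 9):
        raise ValueError("expected 8 (PHQ-8) or 9 (PHQ-9) responses")
    return sum(RESPONSE_SCORES[r.lower()] for r in responses)


# A made-up respondent: four items "several days", three items
# "more than half the days", two items "not at all".
example = (
    ["several days"] * 4
    + ["more than half the days"] * 3
    + ["not at all"] * 2
)
print(phq_total(example))  # 4*1 + 3*2 + 2*0 = 10
```

The total is simply a sum of item scores, so anything that shifts how a respondent picks a label shifts the total directly.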
The instructions tell respondents to rate symptoms based on how often they were “bothered” by them. This wording is subtle but important. The PHQ is intended to measure severity, not merely the presence or frequency of symptoms. A symptom that occurs often but causes little distress should theoretically be scored differently than a symptom that occurs less frequently but is deeply distressing.
In practice, clinicians and researchers usually assume that PHQ scores represent symptom frequency and severity in a consistent way across individuals. The new study challenges that assumption.
The study, titled “Interpretation Issues With the Patient Health Questionnaire Instructions,” was published online on December 17, 2025, in JAMA Psychiatry. The authors examined whether respondents interpret the PHQ instructions as intended.
Researchers analyzed data from two groups: a community sample and a clinical sample.
Participants completed the PHQ-8 and then answered additional questions designed to assess how they interpreted the questionnaire instructions.
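Participants were asked three critical questions: how they would respond in a hypothetical scenario involving oversleeping, how they had actually responded to the PHQ, and how they would respond in the future.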
These questions allowed researchers to determine whether respondents understood and applied the instructions consistently.
The results revealed a striking pattern of misunderstanding.
In the hypothetical oversleeping scenario, only:
When participants were asked how they had actually responded to the PHQ:
When asked how they would respond in the future, the numbers remained similarly low. This suggests that misinterpretation is not random or temporary but rather stable over time.
In other words, most respondents were not answering the PHQ based on how often symptoms bothered them. Instead, many were answering based on raw frequency, perceived severity, or a personal mixture of both.
At first glance, the difference between symptom frequency and symptom burden may seem minor. In reality, it has major implications.
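Consider two individuals: Person A experiences a symptom on most days but is barely bothered by it, while Person B experiences the symptom less often but finds it deeply distressing.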
If both individuals answer the PHQ based on frequency alone, Person A may appear more depressed. If they answer based on how bothered they feel, Person B may score higher. If respondents interpret the questionnaire differently, their scores become difficult to compare.
This inconsistency undermines one of the core purposes of standardized screening tools, which is to allow meaningful comparisons across individuals, settings, and time.
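The comparison can be made concrete with a small sketch. Everything here is an illustrative assumption rather than anything taken from the study: the day counts for the two people are invented, and the cut-offs that map a number of days onto a 0 to 3 item score are one plausible reading of the response labels, not an official rule.

```python
from dataclasses import dataclass


@dataclass
class SymptomReport:
    days_present: int   # days (out of 14) on which the symptom occurred
    days_bothered: int  # days (out of 14) on which it bothered the person


def item_score(days: int) -> int:
    """Map a day count to a 0-3 item score (assumed, illustrative cut-offs)."""
    if not 0 <= days <= 14:
        raise ValueError("days must be between 0 and 14")
    if days == 0:
        return 0  # not at all
    if days <= 7:
        return 1  # several days
    if days <= 11:
        return 2  # more than half the days
    return 3      # nearly every day


# Person A: symptom on most days, but it never bothers them.
person_a = SymptomReport(days_present=13, days_bothered=0)
# Person B: symptom less often, but distressing every time it occurs.
person_b = SymptomReport(days_present=9, days_bothered=9)

for name, person in [("A", person_a), ("B", person_b)]:
    print(
        f"Person {name}: frequency reading = {item_score(person.days_present)}, "
        f"bother reading = {item_score(person.days_bothered)}"
    )
```

Under the frequency reading, Person A scores 3 and Person B scores 2. Under the bother reading, Person A scores 0 and Person B scores 2. The ranking of the two people flips depending on which interpretation they apply, even though nothing about their symptoms has changed.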
The findings raise important concerns for clinicians who rely on the PHQ to guide care.
If patients interpret the PHQ differently, clinicians may overestimate or underestimate depression severity. This could lead to inappropriate treatment decisions, such as prescribing medication when it is not needed or failing to intervene when support is necessary.
The PHQ is often used to track changes in symptoms over time. If a patient’s interpretation of the questionnaire shifts or differs from the clinician’s assumptions, changes in scores may not reflect true changes in mental health.
Clinicians may assume that PHQ scores reflect distress, while patients may be reporting frequency alone. This mismatch can affect conversations about treatment goals and priorities.
The study also has far-reaching implications for mental health research.
Large epidemiological studies often use the PHQ to estimate depression prevalence. Clinical trials frequently use PHQ scores as primary or secondary outcomes. Digital mental health tools rely on PHQ data to personalize interventions.
If respondents interpret the questionnaire inconsistently, the validity of these findings is called into question. Differences in PHQ scores across populations or studies may reflect differences in interpretation rather than true differences in depression severity.
This issue is especially important when comparing clinical and nonclinical samples. The study found that misinterpretation was even more common among individuals with higher depression severity, which may distort comparisons between groups.
None of this necessarily means the PHQ should be abandoned. It remains a valuable tool, particularly because it is brief, accessible, and well validated in many contexts. However, this study suggests that its instructions may not be as clear or intuitive as previously assumed.
Rather than abandoning the PHQ, the findings point to the need for improvement. Possible solutions include:
Beyond the PHQ itself, this study highlights a broader challenge in mental health assessment. Psychological constructs such as depression are subjective and multidimensional. Capturing them with brief self-report tools is inherently complex.
Assumptions about shared understanding between questionnaire developers, clinicians, and patients may not hold true. Even small ambiguities in wording can have large effects on how data are generated and interpreted.
As mental health care becomes increasingly data-driven, ensuring that measurement tools are both valid and interpretable is more important than ever.
The new JAMA Psychiatry study provides compelling evidence that the Patient Health Questionnaire is widely misunderstood by both community members and clinical patients. Most respondents do not interpret the instructions as intended, and this misinterpretation appears to be stable over time.
These findings raise important questions about how PHQ scores are used in clinical decision making, research, and public health surveillance. While the PHQ remains a valuable tool, clinicians and researchers should be cautious in assuming that scores reflect symptom burden in a consistent way.
Ultimately, improving mental health assessment will require clearer tools, better communication, and a deeper appreciation of how individuals understand and report their experiences.
Panayiotou M, Razum J, Eisele G, et al. Interpretation Issues With the Patient Health Questionnaire Instructions. JAMA Psychiatry. Published online December 17, 2025. doi:10.1001/jamapsychiatry.2025.3796
This article is for informational and educational purposes only. It does not constitute medical advice, diagnosis, or treatment. Readers should consult qualified healthcare professionals regarding mental health concerns or clinical decision making. The interpretations presented here are based on a single published study and should be considered within the broader scientific literature.
