OBJECTIVES: To compare two consensus development methods commonly used for developing clinical guidelines in terms of the judgments produced, closeness of consensus, amount of change between rounds, concordance with research evidence, and reliability. METHODS: In all, 213 general practitioners and mental health professionals from England participated in four Delphi and four nominal groups. They rated the appropriateness of four treatments (cognitive behavioural therapy [CBT], behavioural therapy [BT], brief psychodynamic interpersonal therapy [BPIT] and antidepressants) for three conditions. First, participants rated the appropriateness of interventions independently, using a postal questionnaire. For nominal groups, the ratings were fed back and discussed at a meeting, and then group members privately completed the questionnaire again. For Delphi groups, there was feedback but no discussion, and the entire process was conducted by postal questionnaire. RESULTS: The effect of consensus method on final ratings varied with therapeutic intervention, with nominal groups rating CBT and antidepressants more favourably than Delphi groups. Consensus was closer in the nominal groups than in the Delphi groups in both rounds. There was no overall difference between groups in their concordance with research evidence (odds ratio 1.13, 95% confidence interval 0.79 to 1.61). In this study, the Delphi method was more reliable (kappa coefficients 0.88 and 0.89, compared with 0.41 and 0.65 for nominal groups). CONCLUSIONS: The advantages of nominal groups (more consensus; greater understanding of reasons for disagreement) could be combined with the greater reliability of the Delphi approach by developing a hybrid method.
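
The reliability figures above are kappa coefficients, which measure agreement beyond what chance alone would produce. The abstract does not say which kappa variant was used or exactly which sets of ratings were compared, so the following is only a minimal sketch of unweighted Cohen's kappa, assuming two groups assign one categorical rating to each scenario. The function name `cohen_kappa` and the toy ratings are hypothetical and are not data from the study.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equal-length lists of categorical ratings.

    Illustrative only: the study may have used a different kappa variant.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")

    n = len(ratings_a)

    # Observed agreement: share of items given the same rating by both groups.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n

    # Chance agreement: for each category, the product of the two groups'
    # marginal proportions, summed over categories.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both groups used a single identical category throughout;
        # kappa is undefined, so treat it as perfect agreement by convention.
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical toy ratings for four scenarios (not study data).
group_1 = ["appropriate", "appropriate", "inappropriate", "inappropriate"]
group_2 = ["appropriate", "appropriate", "inappropriate", "appropriate"]
kappa = cohen_kappa(group_1, group_2)
print(f"kappa = {kappa:.2f}")
```

In this toy case the two groups agree on three of four scenarios (observed agreement 0.75), while their marginal rating frequencies imply a chance agreement of 0.5, so kappa rescales the raw agreement to the portion not attributable to chance.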