Background: Previous studies have been inconclusive regarding the validity and reliability of preference elicitation methods.

Objective: The aim of this study was to compare the metrics obtained from a discrete choice experiment (DCE) and profile-case best-worst scaling (BWS) with respect to hip replacement.

Methods: We surveyed men from the general US population who were aged 45 to 65 years and potentially eligible for hip replacement surgery. The survey included sociodemographic questions, eight DCE questions, and twelve BWS questions. Attributes were the probability of a first and second revision, pain relief, ability to participate in sports and perform daily activities, and length of hospital stay. Conditional logit analysis was used to estimate attribute weights, level preferences, and the maximum acceptable risk (MAR) for undergoing revision surgery in six hypothetical treatment scenarios with different attribute levels.

Results: A total of 429 (96%) respondents were included. Comparable attribute weights and level preferences were found for both BWS and DCE. Preferences were strongest for hip replacement surgery with high pain relief and the ability to participate in sports and perform daily activities. Although the estimated MARs for revision surgery followed the same trend for both methods, the DCE-based MARs were systematically higher in five of the six scenarios.

Conclusions: This study confirms previous findings that BWS and DCEs are comparable in estimating attribute weights and level preferences. However, the risk tolerance threshold based on the estimation of MAR differs between these methods, possibly leading to inconsistency in comparing treatment scenarios.
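
As a rough illustration of how a MAR can be derived from conditional logit estimates, the sketch below assumes that revision risk enters the utility function linearly with a negative coefficient, so that the MAR is the risk increase that exactly offsets the utility gain from a treatment's benefits. This is one common formulation and not necessarily the specification used in this study. The function names and all coefficient values are hypothetical and are not results from the survey.

```python
import math


def choice_probabilities(utilities):
    """Conditional logit choice probabilities: P(j) = exp(V_j) / sum_k exp(V_k)."""
    shift = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def maximum_acceptable_risk(benefit_utility_gain, beta_risk):
    """Risk increase that exactly offsets a utility gain.

    Assumes risk is a linear term in utility with coefficient beta_risk < 0,
    expressed per unit of the risk attribute (e.g. per percentage point).
    """
    if beta_risk >= 0:
        raise ValueError("beta_risk must be negative for a MAR to exist")
    return benefit_utility_gain / -beta_risk


# Hypothetical values for illustration only (NOT estimates from the study).
beta_risk = -0.5      # utility change per percentage point of revision risk
benefit_gain = 1.5    # utility gain from, e.g., higher pain relief

mar = maximum_acceptable_risk(benefit_gain, beta_risk)

# At the MAR, the improved treatment and the baseline have equal utility,
# so a respondent is indifferent between them under this model.
baseline_utility = 0.0
treatment_utility = baseline_utility + benefit_gain + beta_risk * mar
probs = choice_probabilities([baseline_utility, treatment_utility])
print(f"MAR: {mar:.1f} percentage points; choice probabilities at MAR: {probs}")
```

Because the MAR is a ratio of estimated coefficients, two methods can yield similar relative attribute weights and still produce different MARs if the estimated risk coefficient differs between them.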