In a multi-center patient study that uses different CT scanners, CT-based finite element (FE) models are used to calculate failure loads of femora with metastases. Previous studies showed that using different CT scanners can result in different outcomes. This study aims to quantify the effects of (i) different CT scanners; (ii) different CT protocols with variations in slice thickness, field of view (FOV), and reconstruction kernel; and (iii) air between the calibration phantom and the patient, on Hounsfield units (HU), bone mineral density (BMD), and FE failure load. Six cadaveric femora were scanned on four CT scanners. Scans were made with multiple CT protocols and with or without an air gap between the body model and the calibration phantom. HU and calibrated BMD were determined in cortical and trabecular regions of interest. Non-linear isotropic FE models were constructed to calculate failure load. Mean differences between CT scanners were up to 7% in cortical HU, 6% in trabecular HU, 6% in cortical BMD, 12% in trabecular BMD, and 17% in failure load. Changes in slice thickness and FOV had little effect (≤4%), whereas the reconstruction kernel had a larger effect on HU (16%), BMD (17%), and failure load (9%). Air between the body model and the calibration phantom slightly decreased HU, BMD, and failure load (≤8%). In conclusion, quantitative analysis of CT images acquired with different CT scanners, and particularly with different reconstruction kernels, can introduce relatively large differences in HU, BMD, and failure load. Additionally, air artifacts should be avoided where possible.
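The phantom-based calibration and the relative differences reported above can be illustrated with a short sketch. It assumes a linear HU-to-BMD relation fitted to the calibration phantom inserts, which is a common approach but is not confirmed here as the method used in this study. The function names and all numeric values are hypothetical and serve only as an illustration.

```python
def fit_linear_calibration(phantom_hu, phantom_density):
    """Least-squares fit of density = slope * HU + intercept.

    phantom_hu: mean HU measured in each phantom insert (hypothetical input)
    phantom_density: known insert densities, e.g. in mg/cm^3 (hypothetical input)
    """
    n = len(phantom_hu)
    mean_hu = sum(phantom_hu) / n
    mean_rho = sum(phantom_density) / n
    sxx = sum((h - mean_hu) ** 2 for h in phantom_hu)
    sxy = sum((h - mean_hu) * (r - mean_rho)
              for h, r in zip(phantom_hu, phantom_density))
    slope = sxy / sxx
    intercept = mean_rho - slope * mean_hu
    return slope, intercept


def hu_to_bmd(hu, slope, intercept):
    """Convert a HU value to calibrated BMD with the fitted line."""
    return slope * hu + intercept


def percent_difference(value, reference):
    """Relative difference of value with respect to reference, in percent."""
    return 100.0 * (value - reference) / reference


if __name__ == "__main__":
    # Hypothetical phantom readings, not data from the study.
    slope, intercept = fit_linear_calibration([0, 100, 200], [0, 80, 160])
    bmd_a = hu_to_bmd(500, slope, intercept)  # region of interest, scanner A
    bmd_b = hu_to_bmd(530, slope, intercept)  # same region, scanner B
    print(percent_difference(bmd_b, bmd_a))
```

In practice each scan would be calibrated with its own phantom fit, so that differences between scanners or kernels appear both in the raw HU values and in the fitted slope and intercept.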