• OBJECTIVE
    • To develop an artificial intelligence (AI) system capable of classifying and segmenting femoral fractures, and to compare its performance against existing state-of-the-art methods.
  • METHODS
    • This Institutional Review Board (IRB)-approved retrospective study did not require informed consent. A total of 10,308 hip x-rays from 2618 patients were retrieved from the hospital PACS; 986 were randomly selected for annotation and randomly split into training, validation, and test sets at the patient level. Two radiologists segmented and classified femoral fractures based on their location (femoral neck, pertrochanteric region, or subtrochanteric region) and grade, using the Garden and Evans scales for the neck and pertrochanteric regions, respectively. A YOLOv8 segmentation convolutional neural network (CNN) was trained to generate fracture masks and indicate their class and grade. Classification CNNs were trained on the same dataset for method comparison.
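A patient-level split means that all images of one patient fall into the same subset, so the test set contains no patient seen during training. The sketch below illustrates the idea only; it is not the study's code. The function name `split_by_patient`, the 70/15/15 proportions, and the seed are hypothetical, since the abstract does not state the split ratios.

```python
import random


def split_by_patient(image_to_patient, fractions=(0.70, 0.15, 0.15), seed=0):
    """Split images into train/val/test so that no patient spans two subsets.

    image_to_patient: dict mapping an image ID to its patient ID.
    fractions: assumed train/val/test proportions (hypothetical values).
    """
    # Shuffle patients, not images, so the split is made at the patient level.
    patients = sorted(set(image_to_patient.values()))
    random.Random(seed).shuffle(patients)

    n_train = round(fractions[0] * len(patients))
    n_val = round(fractions[1] * len(patients))

    subset_of = {}
    for index, patient in enumerate(patients):
        if index < n_train:
            subset_of[patient] = "train"
        elif index < n_train + n_val:
            subset_of[patient] = "val"
        else:
            subset_of[patient] = "test"

    # Every image inherits the subset of its patient.
    splits = {"train": [], "val": [], "test": []}
    for image, patient in image_to_patient.items():
        splits[subset_of[patient]].append(image)
    return splits
```

Splitting on patients rather than images avoids leakage when one patient contributes several radiographs, which is the case here (10,308 x-rays from 2618 patients).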
  • RESULTS
    • On the test set, YOLOv8 achieved a Dice coefficient of 0.77 (95% CI: 0.56-0.98) for segmenting fractures, an accuracy of 86.2% (95% CI: 80.77-90.55) for classification and grading, and an AUC of 0.981 (95% CI: 0.965-0.997) for fracture detection. These metrics are on par with or exceed those of previously published AI methods, demonstrating the efficacy of our approach.
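For reference, the Dice coefficient reported above measures the overlap between a predicted mask A and a reference mask B as 2|A∩B| / (|A| + |B|). The following minimal sketch computes it for two binary masks given as nested lists. The function name `dice_coefficient` and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details taken from the study.

```python
def dice_coefficient(pred_mask, true_mask):
    """Dice = 2|A∩B| / (|A| + |B|) for two binary masks of equal size."""
    # Flatten the 2D masks into lists of booleans.
    pred = [bool(v) for row in pred_mask for v in row]
    true = [bool(v) for row in true_mask for v in row]
    if len(pred) != len(true):
        raise ValueError("masks must contain the same number of pixels")

    intersection = sum(p and t for p, t in zip(pred, true))
    total = sum(pred) + sum(true)

    # Assumed convention: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy example: each mask has 3 foreground pixels, 2 of which overlap.
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 0],
             [0, 1, 1]]
score = dice_coefficient(predicted, reference)  # 2 * 2 / (3 + 3) ≈ 0.667
```

A value of 1 indicates identical masks and 0 indicates no overlap. In practice the same computation would usually be done on image arrays with a numerical library.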
  • CONCLUSIONS
    • The high accuracy and AUC values demonstrate the potential of the proposed neural network as a reliable tool in clinical settings. Further, it is the first to provide a precise segmentation of femoral fractures, as indicated by the Dice scores, which may enhance interpretability. A formal evaluation is planned to further assess its clinical applicability.
  • CRITICAL RELEVANCE STATEMENT
    • The proposed system offers high granularity in fracture classification and is the first to segment femoral fractures, which may aid interpretability.
  • KEY POINTS
    • We present the first AI method that segments and grades femoral fractures. The method classifies fractures by location and type. Its high accuracy and interpretability promise utility in clinical practice.