Article
Peer-Review Record

AI: Can It Make a Difference to the Predictive Value of Ultrasound Breast Biopsy?

Diagnostics 2023, 13(4), 811; https://doi.org/10.3390/diagnostics13040811
by Jean L. Browne 1, Maria Ángela Pascual 1,*, Jorge Perez 1, Sulimar Salazar 1, Beatriz Valero 1, Ignacio Rodriguez 1, Darío Cassina 1, Juan Luis Alcázar 2, Stefano Guerriero 3 and Betlem Graupera 1
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 19 January 2023 / Revised: 17 February 2023 / Accepted: 18 February 2023 / Published: 20 February 2023
(This article belongs to the Section Medical Imaging and Theranostics)

Round 1

Reviewer 1 Report

The work compares the recommendations for biopsy after ultrasound diagnosis made by a human and by an AI reader. The idea is fine, but the chosen AI software is a problem. It does not specify the AI methods used in the predictions (neural networks? fuzzy-logic expert systems? What choice of them?...) - this makes the methods not entirely scientific. I'd strongly encourage the authors to contact the program developer for information about the algorithms used, so we have a chance to believe that it is based on solid science.

Apart from that, the software is already used in diagnostics and we can't change this. So no matter how it works, it makes sense to check whether it is useful. The results show that in the case of more severe BI-RADS ratings the software can limit the number of unnecessary biopsies, but for low BI-RADS ratings it can miss some cases.

I find Table 3 difficult to read. I think it would be better to show column charts that display the percentage of malignant biopsies for each combination of BI-RADS and KOIOS score. This would allow checking for dependency between these two measures and assessing their joint predictive strength.
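As a rough illustration of the suggested presentation, the sketch below builds such a column chart; the BI-RADS categories, KOIOS labels, and all percentages are hypothetical placeholders, not data from the study.

```python
# Minimal sketch (hypothetical data): grouped column chart of the percentage of
# malignant biopsies for each BI-RADS / KOIOS-score combination.
import numpy as np
import matplotlib.pyplot as plt

birads = ["3", "4a", "4b", "4c", "5"]                                   # example BI-RADS categories
koios = ["benign", "prob. benign", "suspicious", "prob. malignant"]     # assumed KOIOS labels

# pct_malignant[i, j]: % of biopsies that were malignant for BI-RADS i and KOIOS j
pct_malignant = np.array([
    [0,  2,  5, 10],
    [1,  4, 15, 30],
    [2, 10, 35, 60],
    [5, 20, 55, 85],
    [10, 30, 70, 95],
], dtype=float)                                                          # illustrative numbers only

x = np.arange(len(birads))
width = 0.2
fig, ax = plt.subplots()
for j, label in enumerate(koios):
    # one group of columns per BI-RADS category, one column per KOIOS label
    ax.bar(x + (j - 1.5) * width, pct_malignant[:, j], width, label=label)

ax.set_xticks(x)
ax.set_xticklabels(birads)
ax.set_xlabel("BI-RADS category")
ax.set_ylabel("Malignant biopsies (%)")
ax.legend(title="KOIOS score")
plt.tight_layout()
plt.show()
```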

In Tables 1 and 2 there is no description of the meaning of the B (biopsies?) and M (malignant?) symbols.

Author Response

Thank you very much for your comments.

 

Point 1: The work compares the recommendations for biopsy after ultrasound diagnosis made by a human and by an AI reader. The idea is fine, but the chosen AI software is a problem. It does not specify the AI methods used in the predictions (neural networks? fuzzy-logic expert systems? What choice of them?...) - this makes the methods not entirely scientific. I'd strongly encourage the authors to contact the program developer for information about the algorithms used, so we have a chance to believe that it is based on solid science.

 

Response 1: We have added information about the mechanics of how the black box was developed; we contacted the manufacturer to this end. We admit it is not much. However, we have also added that KOIOS has been approved by the FDA and the EMA. Though we lack in-depth knowledge of how this approval is obtained, we believe these are thresholds that should provide some confidence in the scientific background of the system.

We added this text in line 48: “through machine learning. It has been approved by the Food and Drug Administration and the European Medicines Agency”.

 

Point 2: Apart from that, the software is already used in diagnostics and we can't change this. So no matter how it works, it makes sense to check whether it is useful. The results show that in the case of more severe BI-RADS ratings the software can limit the number of unnecessary biopsies, but for low BI-RADS ratings it can miss some cases.

 

Response 2: We agree with you that AI is slowly entering diagnostic procedures, whether we like it or not. Our objective was to see how it performs and to check whether it works. The results, as you state, are not perfect.

 

Point 3: I find Table 3 difficult to read. I think it would be better to show column charts that display the percentage of malignant biopsies for each combination of BI-RADS and KOIOS score. This would allow checking for dependency between these two measures and assessing their joint predictive strength.

 

Response 3: We understand that the main problem here is Table 3. We have followed your suggestion to present the results in column charts that display the percentage of malignant biopsies for each combination of BI-RADS and KOIOS score.

We added Figure 1 (“Probability of malignant biopsies for each combination of BI-RADS and KOIOS score”) in line 179.

As you say, this in fact reflects how the results would have been if the two measures were combined.

We have considered presenting Table 3 in a different design but have not found a better one (the lack of lines is due to editorial requirements).

 

Point 4: In Tables 1, 2 there is no description of the meaning of B (biopsies?) and M (malignant?) symbols.

 

Response 4: Corrected.

 

 

“English language and style are fine/minor spell check required”. Corrected; the manuscript was sent to an editing facility to this end.

Author Response File: Author Response.docx

Reviewer 2 Report

The authors of this manuscript present an interesting study in which they compare a new diagnostic method based on an algorithm (the AI algorithm KOIOS DS™) against ultrasound techniques (BI-RADS) and against the classic pathological results.

The issue of this manuscript fits well with the subject of the journal Diagnostics.

The text is well written and the presentation is good.

This work is the result of a meritorious international collaboration.

The Introduction provides sufficient background and includes enough (5) relevant references.

The cited references are fresh (21/22 from 2015 to now) and all of them are relevant to the research. There are no self-cites.

Research design and methods are clearly described.

The authors could write their conclusions in a separate section rather than in the Discussion.

Author Response

Point 1: The authors of this manuscript present an interesting study in which they compare a new diagnostic method based on an algorithm (the AI algorithm KOIOS DS™) against ultrasound techniques (BI-RADS) and against the classic pathological results.

 

Response 1: Thank you very much for this comment.

 

Point 2: The issue of this manuscript fits well with the subject of the journal Diagnostics.

 

Response 2: Thank you very much for this comment.

 

Point 3: The text is well written and the presentation is good.

 

Response 3: Thank you very much for this comment.

 

Point 4: This work is the result of a meritorious international collaboration.

 

Response 4: Thank you very much for this comment.

 

Point 5: The Introduction provides sufficient background and includes enough (5) relevant references.

 

Response 5: Thank you very much for this comment.

 

Point 6: The cited references are fresh (21/22 from 2015 to now) and all of them are relevant to the research. There are no self-cites.

 

Response 6: Thank you very much for this comment.

 

Point 7: Research design and methods are clearly described.

 

Response 7: Thank you very much for this comment.

 

Point 8: The authors could write their conclusions in a separate section rather than in the Discussion.

 

Response 8: Thank you for your suggestion; we have done so.

 

 

“English language and style are fine/minor spell check required”. Corrected; the manuscript was sent to an editing facility to this end.

Author Response File: Author Response.docx

Reviewer 3 Report

I would like to thank you for inviting me to review the manuscript entitled "AI: Can it make a difference to the predictive value of ultrasound breast biopsy?". The authors compare the pathologic results with the BI-RADS classification of the images by the readers against the result of processing the same images through an AI algorithm. The results presented are very interesting and well-justified.

I would like to propose that the authors compare the histological types of breast carcinomas with the results of the human vs. AI interpretation of these lesions. It is known that some carcinomas may appear on imaging studies as benign (mucinous carcinomas, invasive breast carcinomas of no special type with a medullary pattern, etc.). In contrast, some benign lesions (radial scar/complex sclerosing lesion) may appear as suspicious for malignancy or malignant. Are these special subtypes or the above-mentioned benign lesions related to differences in the interpretation of human vs. AI, or do the interpretation differences exist in NST carcinoma as well?

Another limitation of this and every such study is that, since human subjects select the most representative BI-RADS image, the results of the comparison of human vs. AI will always depend on the quality of the human selection.

Author Response

Thank you very much for your comments.

 

Point 1: I would like to thank you for inviting me to review the manuscript entitled "AI: Can it make a difference to the predictive value of ultrasound breast biopsy?". The authors compare the pathologic results with the BI-RADS classification of the images by the readers against the result of processing the same images through an AI algorithm. The results presented are very interesting and well-justified.

 

Response 1: Thank you very much for this comment.

 

Point 2: I would like to propose that the authors compare the histological types of breast carcinomas with the results of the human vs. AI interpretation of these lesions. It is known that some carcinomas may appear on imaging studies as benign (mucinous carcinomas, invasive breast carcinomas of no special type with a medullary pattern, etc.). In contrast, some benign lesions (radial scar/complex sclerosing lesion) may appear as suspicious for malignancy or malignant. Are these special subtypes or the above-mentioned benign lesions related to differences in the interpretation of human vs. AI, or do the interpretation differences exist in NST carcinoma as well?

 

Response 2: An interesting proposition. It is known that different histological types generally have different morphologies. In this study we did not want to influence the radiologists to choose an image according to the histology result, so they were blinded to the histology results while using KOIOS. A much larger number of cases (due to the relative rarity of medullary or mucinous carcinomas) would be needed. Another approach would have been to segregate the imaging findings according to BUS morphology. This was our first encounter with AI (on loan), and we wanted a fast design to see how it worked.

 

Point 3: Another limitation of this and every such study is that, since human subjects select the most representative BI-RADS image, the results of the comparison of human vs. AI will always depend on the quality of the human selection.

 

Response 3: We agree; this is a fact we acknowledge, owing to the selection bias of the images as well as of the cases themselves.

 

 

“English language and style are fine/minor spell check required”. Corrected; the manuscript was sent to an editing facility to this end.

Author Response File: Author Response.docx

Round 2

Reviewer 1 Report

The authors contacted the software developer to obtain some information regarding the implemented algorithms. The information is limited. I think it should be highlighted in the text that the authors could not obtain more details due to the developer's commercial secrets. Highlighting that the software is approved by the agencies is fine.

As for Table 3 - I'd rather see it as a 3D column chart, i.e. with the BI-RADS score on the horizontal axis, the KOIOS score on the axis directed perpendicular to the screen, and the probability value (or the count of cases having the attributes found on the non-vertical axes) on the vertical axis.

Such a presentation of the data is the most general: summing, for a given BI-RADS score, the counts over all KOIOS scores gives the count of that BI-RADS score in the sample; summing, for a given KOIOS score, the counts over all BI-RADS scores gives the count of that KOIOS score in the sample.

Having this, it would be possible (perhaps in other research), for example, to check whether p(KOIOS value A) * p(BI-RADS value B) = p(KOIOS value A and BI-RADS value B). If so, the measures are independent (there are statistical tests for this). Consequently, having two measures that are not equivalent allows the development of a prognostic system that is better than a prognosis based only on the BI-RADS or only on the KOIOS value. However, to enable somebody to elaborate the data in this way, the pairwise count of cases with each particular pair of KOIOS and BI-RADS labels is vital.
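To make the suggested analysis concrete, a minimal sketch is given below: an invented BI-RADS × KOIOS contingency table of pairwise case counts, its marginal sums, and a chi-square test of independence between the two measures. None of the numbers come from the study; the KOIOS categories are assumed for illustration.

```python
# Minimal sketch with invented counts: pairwise BI-RADS x KOIOS contingency table,
# marginal sums, and a chi-square test of independence between the two measures.
import numpy as np
from scipy.stats import chi2_contingency

# rows: BI-RADS 3, 4a, 4b, 4c, 5; columns: four hypothetical KOIOS categories
counts = np.array([
    [40, 10,  3,  1],
    [25, 20, 10,  4],
    [10, 15, 20, 10],
    [ 3, 10, 20, 25],
    [ 1,  3, 10, 30],
])

# Summing over KOIOS categories gives the BI-RADS counts in the sample, and vice versa.
birads_totals = counts.sum(axis=1)
koios_totals = counts.sum(axis=0)

# Chi-square test: under the null hypothesis the two measures are independent,
# i.e. p(BI-RADS = b) * p(KOIOS = k) = p(BI-RADS = b and KOIOS = k) for all b, k.
chi2, p_value, dof, expected = chi2_contingency(counts)

print("BI-RADS totals:", birads_totals)
print("KOIOS totals:  ", koios_totals)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p_value:.3g}")
```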

Author Response

Response to Reviewer 1 Comments

Point 1: The authors contacted the software developer to obtain some information regarding the implemented algorithms. The information is limited. I think it should be highlighted in the text that the authors could not obtain more details due to the developer's commercial secrets. Highlighting that the software is approved by the agencies is fine.

 Response 1: Thank you for this comment. We added this text in line 49: “(No more explicit information was provided by the manufacturer)”.

Point 2: As for Table 3 - I'd rather see it as a 3D column chart, i.e. with the BI-RADS score on the horizontal axis, the KOIOS score on the axis directed perpendicular to the screen, and the probability value (or the count of cases having the attributes found on the non-vertical axes) on the vertical axis.

Such a presentation of the data is the most general: summing, for a given BI-RADS score, the counts over all KOIOS scores gives the count of that BI-RADS score in the sample; summing, for a given KOIOS score, the counts over all BI-RADS scores gives the count of that KOIOS score in the sample.

Having this, it would be possible (perhaps in other research), for example, to check whether p(KOIOS value A) * p(BI-RADS value B) = p(KOIOS value A and BI-RADS value B). If so, the measures are independent (there are statistical tests for this). Consequently, having two measures that are not equivalent allows the development of a prognostic system that is better than a prognosis based only on the BI-RADS or only on the KOIOS value. However, to enable somebody to elaborate the data in this way, the pairwise count of cases with each particular pair of KOIOS and BI-RADS labels is vital.

Response 2: Thank you for the suggestion; we have added the 3D column chart in line 179.

Thank you for the recommendations; we will take them into account in future research.

Author Response File: Author Response.docx
