Name | Description | Size | Format |
---|---|---|---|
 | | 5.02 MB | Adobe PDF |
Authors
Advisor(s)
Abstract(s)
This dissertation presents the development of a solution for classifying anatomical landmarks in endoscopic images. Such a solution makes it possible to confirm that specific regions of the digestive system were reached, and when. With this information, quality indicators can be computed, such as how long the gastroenterologist spent examining a given section of the digestive tract, a crucial factor in determining the quality of an endoscopic procedure. The solution is based on a convolutional neural network, specifically DenseNet-121. The model was created and evaluated using the Kvasir dataset, a public dataset that contains images of the anatomical landmarks caecum, pylorus and Z line. The metrics evaluated were accuracy and prediction speed on a graphics card. The model was also evaluated without the graphics card to measure the difference between the two scenarios. A graphics card proved to be a mandatory requirement for integrating the model into a real-time video acquisition and processing solution, since it reduces the average prediction time from 436.67 to 17 milliseconds, a reduction of 96.11 %. Regarding accuracy, for images belonging to the same dataset used in training, the results were satisfactory, with an average prediction accuracy of 97.33 %. However, when images from outside the dataset were included, that is, images that do not represent anatomical landmarks, accuracy dropped to 78.75 %. The model therefore has a problem classifying images that do not represent anatomical landmarks.
Even so, the results can be considered promising: they show that a solution of this nature can be viable, but only if the identified problem is solved. To that end, the addition of a class for images that do not represent anatomical landmarks is suggested, which requires creating a dataset for that purpose.
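The timing figures in the abstract can be checked with a few lines of arithmetic. The sketch below recomputes the reported reduction from the two average prediction times and converts each latency into an approximate throughput. The frames-per-second reading is an assumption made here for illustration (one prediction per frame, processed sequentially); the abstract itself does not state a target frame rate, and the function names are hypothetical.

```python
def percent_reduction(before_ms: float, after_ms: float) -> float:
    """Relative reduction from before_ms to after_ms, in percent."""
    return (before_ms - after_ms) / before_ms * 100.0


def frames_per_second(latency_ms: float) -> float:
    """Throughput if each frame needs one sequential prediction (assumption)."""
    return 1000.0 / latency_ms


# Average prediction times reported in the abstract.
cpu_ms = 436.67  # without the graphics card
gpu_ms = 17.0    # with the graphics card

print(f"reduction: {percent_reduction(cpu_ms, gpu_ms):.2f} %")  # 96.11 %
print(f"CPU throughput: {frames_per_second(cpu_ms):.1f} frames/s")
print(f"GPU throughput: {frames_per_second(gpu_ms):.1f} frames/s")
```

Under that assumption, roughly 2 frames per second without the graphics card against roughly 59 with it is consistent with the conclusion that the graphics card is required for real-time video processing.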
Description
Keywords
Convolutional neural networks; Anatomical landmarks; Quality indicators