ESS - NEU - Neurofisiologia
Browsing ESS - NEU - Neurofisiologia by Author "Beniczky, Sándor"
- Comparative evaluation of artificial intelligence chatbots in answering electroencephalography-related questions
  Publication: Proença, Soraia; Soares, Joana Isabel; Parra, Joana; Maia, Gisela; Leite, Juliana; Beniczky, Sándor; Jesus-Ribeiro, Joana; Henrique Maia, Gisela Maria
  As large language models (LLMs) become more accessible, they may be used to explain challenging EEG concepts to nonspecialists. This study aimed to compare the accuracy, completeness, and readability of EEG-related responses from three LLM-based chatbots and to assess inter-rater agreement. One hundred questions, covering 10 EEG categories, were entered into ChatGPT, Copilot, and Gemini. Six raters from the clinical neurophysiology field (two physicians, two teachers, and two technicians) evaluated the responses. Accuracy was rated on a 6-point scale, completeness on a 3-point scale, and readability was assessed using the Automated Readability Index (ARI). We used a repeated-measures ANOVA for group differences in accuracy and readability, the intraclass correlation coefficient (ICC) for inter-rater reliability, and a two-way ANOVA, with chatbot and raters as factors, for completeness. Total accuracy was significantly higher for ChatGPT (mean ± SD 4.54 ± .05) than for Copilot (mean ± SD 4.11 ± .08) and Gemini (mean ± SD 4.16 ± .13) (p < .001). ChatGPT's lowest performance was in normal variants and patterns of uncertain significance (mean ± SD 3.10 ± .14), whereas Copilot and Gemini performed lowest in ictal EEG patterns (mean ± SD 2.93 ± .11 and 3.37 ± .24, respectively). Although inter-rater agreement for accuracy was excellent among physicians (ICC = .969) and teachers (ICC = .926), it was poor for technicians in several EEG categories. ChatGPT achieved significantly higher completeness scores than Copilot (p < .001) and Gemini (p = .01). ChatGPT's text (ARI, mean ± SD 17.41 ± 2.38) was less readable than that of Copilot (ARI, mean ± SD 11.14 ± 2.60; p < .001) and Gemini (ARI, mean ± SD 14.16 ± 3.33). The chatbots achieved relatively high accuracy, but not without flaws, emphasizing that the information they provide requires verification. ChatGPT outperformed the other chatbots in accuracy and completeness, though at the expense of readability. The lower inter-rater agreement among technicians may reflect a gap in standardized training or practical experience, potentially affecting the consistency of EEG-related content assessment.
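The ARI scores above come from a standard character-based readability formula: ARI = 4.71 × (characters/words) + 0.5 × (words/sentences) − 21.43. A minimal sketch follows; the tokenization and sentence splitting here are deliberate simplifications, not the exact preprocessing used in the study:

```python
def automated_readability_index(text: str) -> float:
    """ARI = 4.71*(chars/words) + 0.5*(words/sentences) - 21.43.

    Counts alphanumeric characters only and treats '.', '!', '?'
    as sentence terminators -- a rough approximation.
    """
    words = text.split()
    chars = sum(1 for c in text if c.isalnum())
    sentences = max(1, sum(text.count(p) for p in ".!?"))
    return 4.71 * (chars / len(words)) + 0.5 * (len(words) / sentences) - 21.43
```

Higher scores indicate harder text: an ARI of 17 (ChatGPT's mean) corresponds roughly to college-graduate reading level, whereas 11 (Copilot's mean) sits near early high-school level.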
- The sound of silence: Quantification of typical absence seizures by sonifying EEG signals from a custom‐built wearable device
  Publication: Borges, Daniel Filipe; Fernandes, João; Soares, Joana Isabel; Casalta‐Lopes, João; Carvalho, Daniel; Beniczky, Sándor; Leal, Alberto
  Objective: To develop and validate a method for long-term (24-h) objective quantification of absence seizures in the EEG of patients with childhood absence epilepsy (CAE) in their real home environment using a wearable device (waEEG), comparing automatic detection methods with auditory recognition after seizure sonification.
  Methods: The waEEG recording was acquired with two scalp electrodes. Automatic analysis was performed using previously validated software (Persyst® 14) and then fully reviewed by an experienced clinical neurophysiologist. The EEG data were converted into an audio file in waveform format with a 60-fold time compression factor. The sonified EEG was listened to by three inexperienced observers, and the number of seizures and the processing time required for each data set were recorded blind to other data. Quantification of seizures from the patient diary was also assessed.
  Results: Eleven waEEG recordings from seven CAE patients with an average age of 8.18 ± 1.60 years were included. No differences in the number of seizures were found between automated methods and expert audio assessment, with significant correlations between methods (ρ > .89, p < .001) and between observers (ρ > .96, p < .001). For the entire data set, the audio assessment yielded a sensitivity of .830 and a precision of .841, resulting in an F1 score of .835.
  Significance: Auditory waEEG seizure detection by lay medical personnel provided accuracy similar to post-processed automatic detection by an experienced clinical neurophysiologist, in a less time-consuming procedure and without the need for specialized resources. Sonification of long-term EEG recordings in CAE provides a user-friendly and cost-effective clinical workflow for quantifying seizures in clinical practice, minimizing human and technical constraints.
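The 60-fold time compression described above can be achieved simply by writing the raw EEG samples to an audio file whose sample rate is 60× the EEG sampling rate, so 24 hours of recording play back in about 24 minutes. A hedged sketch, not the authors' implementation: the 256 Hz sampling rate, the normalization, and the output file name are all assumptions for illustration:

```python
import wave

import numpy as np


def sonify_eeg(eeg: np.ndarray, fs_eeg: float, compression: int = 60,
               out_path: str = "eeg_audio.wav") -> int:
    """Write one EEG channel as mono 16-bit WAV, played back
    `compression` times faster than real time. Returns the audio rate."""
    audio_rate = int(fs_eeg * compression)   # e.g. 256 Hz * 60 = 15360 Hz
    x = eeg.astype(np.float64) - eeg.mean()  # remove DC offset
    peak = np.abs(x).max()
    if peak > 0:
        x = x / peak                         # normalize to [-1, 1]
    pcm = (x * 32767).astype(np.int16)       # 16-bit PCM samples
    with wave.open(out_path, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)                    # 2 bytes = 16-bit
        w.setframerate(audio_rate)
        w.writeframes(pcm.tobytes())
    return audio_rate
```

Compression shifts EEG activity into the audible range: a 3 Hz spike-wave discharge, typical of absence seizures, becomes a 180 Hz tone that stands out clearly against background activity.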
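The F1 score reported above is the harmonic mean of precision and sensitivity (recall); a quick check against the reported values:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# With the paper's precision (.841) and sensitivity (.830):
print(round(f1_score(0.841, 0.830), 3))  # → 0.835
```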
