A novel autonomous navigation method for capsule endoscopy based on deep reinforcement learning with proximal policy optimization in a virtual environment

Hamidreza Ghahremani¹; Vahid Johari Majd*²
¹Ph.D. Student, School of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran
²Associate Professor, School of Electrical and Computer Engineering, Tarbiat Modares University, Tehran, Iran

doi:10.22034/abmir.2025.23028.1122

پژوهش های نظری و کاربردی هوش ماشینی (literally: Theoretical and Applied Research in Machine Intelligence)
Article 1, Volume 3, Issue 1, Shahrivar 1404 (August-September 2025), Pages 1-15
Article type: Research article
Abstract
Because of widespread concern about the complications of conventional endoscopy, capsule endoscopes have attracted research attention as a less invasive alternative. However, the passive motion of the capsule often prevents it from reaching the viewing angles and regions the physician needs. To overcome this limitation, we propose a novel autonomous navigation approach based on deep reinforcement learning with the proximal policy optimization (PPO) algorithm, which automates localization, path planning, and motion control of the capsule. The method fuses multimodal sensor data to estimate the target point over time. Because initial training of the algorithm requires a large amount of data, we built a near-realistic virtual environment for training the intelligent agent. The environment comprises an actuator model built from magnetic coils, a capsule equipped with a dipole magnet, a camera, and an inertial sensor, and a 3D model of the large intestine. The primary objective of this research is to reduce the operator's interventions in operating the device so that clinicians can concentrate on the clinical and medical aspects of endoscopy. The proposed method was trained with several hyperparameter settings, and the results were compared using metrics for movement toward the target, alignment with the target, and entropy. The evaluation shows that, with well-tuned buffer size and batch size, the algorithm achieves effective tracking and stability.
Keywords
Autonomous navigation; capsule endoscopy; deep reinforcement learning; proximal policy optimization; virtual environment
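For readers unfamiliar with PPO, the following is a minimal sketch of the standard clipped surrogate objective and of how a rollout buffer is divided into minibatches, which is where the buffer size and batch size mentioned in the abstract come into play. It is a generic illustration, not the authors' implementation: the function names `ppo_clipped_objective` and `minibatches`, the clipping value `clip_eps=0.2`, and the toy inputs are all assumptions made for this example.

```python
import math


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean PPO clipped surrogate objective over one minibatch (to be maximized).

    For each sample, the probability ratio r = pi_new(a|s) / pi_old(a|s) is
    computed from log-probabilities, and the smaller of r * A and
    clip(r, 1 - eps, 1 + eps) * A is kept.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)
        clipped_ratio = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        total += min(ratio * adv, clipped_ratio * adv)
    return total / len(advantages)


def minibatches(buffer, batch_size):
    """Split a filled rollout buffer into consecutive minibatches.

    The last minibatch may be shorter when batch_size does not divide
    the buffer length evenly.
    """
    return [buffer[i:i + batch_size] for i in range(0, len(buffer), batch_size)]


# Hypothetical usage: a buffer of 10 transitions split into batches of 4.
batches = minibatches(list(range(10)), 4)
```

In this sketch a larger buffer means more collected experience per policy update, while the batch size sets how many samples contribute to each gradient step. The clipping keeps each update close to the policy that collected the data.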


