1 Introduction

Cancer is a major cause of morbidity and a leading cause of mortality worldwide, accounting for 8.2 million deaths in 2012, with 14.1 million new cases and 32.6 million people living with cancer (diagnosed in the previous 5 years) [1]. Moreover, deaths from cancer are projected to rise to over 13 million by 2030 [2].

In the digestive tract, the most common cancers occur in the oesophagus, stomach and colorectum. Colorectal cancer (CRC), in particular, accounted for 1,360,000 new cases in 2012, being the third most common cancer in men (746,000 cases, 10 % of the total) and the second in women (614,000 cases, 9.2 % of the total) worldwide [1]. Furthermore, oesophageal cancer accounted for 456,000 new cases worldwide in 2012 (3.2 % of the total), while stomach cancer accounted for 952,000 new cases (6.8 % of the total), making it the fifth most common malignancy worldwide. In contrast, cancers of the small bowel are rare, with an annual number of new cases equal to only about 3 % of that of CRC [3].

As demonstrated by the aforementioned statistics, colorectal cancers represent the most significant pathology within the gastrointestinal (GI) tract; for this reason, particular attention has been devoted to CRC in this review.

A significant aspect of CRC concerns its diagnosis and treatment. In particular, the survival rate of CRC patients can reach almost 90 % when diagnosis is made at an early stage, falling to less than 7 % for patients with advanced disease. Several screening tests are effective in reducing CRC incidence and/or mortality, and population screening has been rolled out in Europe and in the United States, mostly for patients older than 50 years or for those with a family history of CRC [4–6]. However, CRC screening programmes can be life-saving only if they are reliable and achieve high adherence; adherence is directly related to invasiveness and the consequent discomfort, and low participation rates dilute the intrinsic efficacy of CRC screening.

To date, conventional colonoscopy is considered the most effective method for CRC diagnosis and represents the gold standard for the evaluation of colonic disease, owing to its ability to visualise the inner surface of the colon, acquire biopsies and treat pre-neoplastic and early-stage neoplastic lesions. However, invasiveness, patient discomfort, fear of pain and, more often than not, the need for conscious sedation limit the take-up of screening colonoscopy [7]. The technology behind standard colonoscopy consists of a long, semi-rigid insertion tube with a steerable tip (stiff if compared to the colon), which is pushed by the physician from the outside. As a result of this driving approach, scope looping occurs during the insertion phase, leading to pain and potential tissue damage or even perforation (e.g., 0.1–0.3 % for diagnostic procedures in colonic tissue) [8].

On the other hand, wireless capsule endoscopy (WCE), which has been established in the last decade, represents an appealing alternative to traditional endoscopic techniques [9]. WCE enables inspection of the entire GI tract without discomfort or the need for sedation, thus avoiding many of the potential risks of conventional endoscopy. Therefore, it can encourage patients to accept GI tract examinations without concerns of pain or invasiveness. However, current WCE models are passive devices whose motion relies on natural bowel peristalsis, which entails the risk of failing to capture images of significant regions, since the practitioner cannot control capsule/camera orientation and motion [10]. For this reason, they are commonly used for inspecting the small bowel, which cannot be reached with standard endoscopes (even though small bowel cancer is much less frequent than CRC), in search of sources of occult bleeding. The small bowel, in fact, has a virtual lumen that in most cases does not require insufflation for proper inspection and does not need navigation to focus on points of interest. Unlike the small bowel, the large bowel requires proper distension for inspection, as well as navigation to allow visual orientation. Therefore, capsule endoscopy of the large bowel should integrate active motion.

When applied to the examination of the large bowel, robotic endoscopic capsules and innovative robotic endoscopes may overcome the drawbacks of pain and discomfort, but they still lack reliability and diagnostic accuracy and, overall, fail because of their inherent inability to combine therapeutic functions with common screening aims [11–13]. Furthermore, these techniques (mainly for robotic endoscopes) are often difficult to learn and master; hence, strict dependence on the operator’s skills introduces subjectivity to the procedure and considerable costs for healthcare systems willing to deliver a standardised procedure [14, 15].

2 Medical needs and clinical issues: key aspects

Technology should help in further promoting CRC screening and allow, as a consequence, a tailored and less invasive treatment. However, the aforementioned factors limit the acceptance of conventional colonoscopy-based screening protocols. For these reasons, different techniques have been proposed, such as the combination with faecal occult blood test (FOBT), in order to reduce the number of colonoscopies. Nevertheless, this approach is burdened by a high rate of false negative results [7]. Another alternative is based on computed tomography (CT) of the colon; however, it requires ideal bowel preparation and considerable X-ray exposure [7].

Direct visualisation of the colonic mucosa is preferred in order to detect subtle mucosal alterations, as in inflammatory bowel diseases, as well as any flat or sessile colonic lesions. Nevertheless, standard colon WCE shows insufficient sensitivity in detecting colonic lesions even after a major technology upgrade [16]; furthermore, the intense bowel preparation required, together with the impossibility of performing a biopsy, prevents colon WCE from taking the lead in the field [17].

An ideal diagnostic tool for the colon should provide direct visualisation and pain-free navigation through a sufficiently distended colon. This can be achieved by avoiding pressure on the bowel wall when advancing, as well as by avoiding extensive, uncontrolled and painful distension of the colon and/or loop formation. Regarding lesion visualization, the medical device should reliably detect lesions larger than 5 mm, which are characterized by an increased potential for dysplasia; this includes lesions in the areas behind bowel folds, which often remain unexplored with conventional endoscopy despite the wide angle of vision and/or the use of transparent caps [18].

2.1 Commercial solutions: main current approaches

Wolf and Schindler are the fathers of modern GI endoscopy. They pioneered the inspection of the GI tract with semi-flexible endoscopes in 1932. Nowadays, flexible scopes are considered the mainstream endoscopic tools; they enable reliable diagnoses in the GI tract and also offer therapeutic and surgical capabilities. However, since scopes are still rather rigid instruments, procedures are likely to be traumatic, also owing to the manoeuvring mechanism, which limits patients’ tolerance and acceptance of the diagnostic technique. Moreover, pain and sedation-related issues make patients reluctant to undergo endoscopy and thus limit the pervasiveness of mass-screening campaigns, which are a high public health priority. Only mass screening guarantees the appropriate selection of who should undergo endoscopy to ensure early detection and treatment of asymptomatic pathologies, with particular attention to CRC.

As mentioned above, diagnosis and treatment in the GI tract are dominated by the use of flexible endoscopes. A few large companies, namely Olympus Medical Systems Co. (Tokyo, Japan), Pentax Medical Co. (Montvale, NJ, USA), Fujinon, Inc. (Wayne, NJ, USA) and Karl Storz GmbH & Co. KG (Tuttlingen, Germany), cover the majority of the market in flexible GI endoscopy. With respect to new technologies, this field is rapidly evolving, as flexible endoscopes are considered a platform for advanced diagnostic and therapeutic procedures.

In recent years, new imaging modalities aiming to enhance conventional white light endoscopy have been adopted in clinical routine and are constantly being further developed. The most prominent imaging enhancement technologies are narrow band imaging (NBI) by Olympus, i-Scan by Pentax, flexible spectral imaging colour enhancement (FICE) by Fujinon, autofluorescence imaging (AFI) by Olympus, and confocal laser endomicroscopy (CLE) by Pentax. Literature shows that enhanced imaging modalities can have an added value in the diagnosis of various pathologic entities with positive effects on accuracy, sensitivity and specificity as well as time and cost of the procedure [19–23].

The field of interventional endoscopy is also constantly evolving. Recent developments, such as over-the-scope clips (OTSC®), enable endoscopists to perform more complex and radical procedures, and even spare patients from surgery through a less traumatic endoscopic intervention [24] (Fig. 1a). The FTRD® system by Ovesco Endoscopy AG (Tübingen, Germany) enables endoscopic full-thickness resection in the colorectum in an effective manner [25] (Fig. 1b). Such novel procedures extend the field of application of endoscopic devices well into the surgical domain.

Fig. 1

a OTSC® [24]; and b FTRD® system by Ovesco Endoscopy AG [25] (Courtesy of Ovesco Endoscopy AG, Tübingen, Germany)

The FDA approval of the first WCE has led to a novel diagnostic technology in endoscopy, especially for small bowel diagnosis. At the time of the present publication, five companies dominate the WCE market. The family of PillCam® WCEs (PillCam® SB3, PillCam® Colon2, PillCam® UGI, and PillCam® PATENCY) was developed over the years following the first WCE introduced by Given Imaging, Ltd. (Yoqneam, Israel), and is currently marketed by Medtronic, Inc. (Dublin, Ireland). Further players in the WCE field are Olympus, Co. (Tokyo, Japan) with the EndoCapsule, IntroMedic, Co., Ltd. (Seoul, South Korea) with the MiroCam, the Chinese group Chongqing Jinshan Science & Technology, Co., Ltd. with their OMOM capsule, and CapsoVision, Inc. (Saratoga, CA, USA) with the 360° panoramic HD image CapsoCam capsule. The most representative commercial capsules are depicted in Fig. 2.

Fig. 2

a PillCam®SB3 (Given Imaging); b PillCam®COLON2 (Given Imaging); c PillCam®UGI (Given Imaging); d PillCam®PATENCY (Given Imaging) - Courtesy of Medtronic, Inc.; e EndoCapsule (Olympus); f OMOM capsule (Chongqing Jinshan Science & Technology) - Reprinted from Intest Res 2016;14(1):21-29 with permission; g MiroCam (IntroMedic); and h CapsoCam (CapsoVision)

The main target diseases of WCE are obscure GI bleeding (OGIB) and the assessment or mapping of newly diagnosed Crohn’s disease (CD). Other indications include surveillance of small intestinal polyposis syndromes or tumours, as well as assessment of response and/or clinical complications of celiac disease [26, 27]. Moreover, two-headed capsules further target the oesophagus and the colon for CRC screening [16].

Latest developments that have been introduced into the market are essentially improvements of previous WCE devices, both in terms of technology (e.g., higher resolution, longer battery lifetime) and/or application (e.g., movement-sensitive control of frame acquisition frequency, changes of capsule size to promote sensitivity in colon screening). However, novel technologies need to combine the low invasiveness and high patient comfort of wireless endoscopic devices with novel, more powerful technological features in order to address widespread improvements in CRC diagnosis and treatment [28].

3 System architecture of a robotic capsule: emerging solutions enabled by microrobotic technologies

A robotic capsule platform consists of at least six primary modules: i) locomotion, ii) localization, iii) vision, iv) telemetry, v) powering, and vi) diagnosis and treatment tools (Fig. 3). However, most robotic endoscopic capsules developed to date include only a subset of the aforementioned modules due to size constraints. Technological integration is challenging; however, thanks to current progress in microsystem technologies and micromachining, as well as in interfacing and integration, modern devices can embed most of these modules and provide both diagnostic and treatment functionalities. The following subsections illustrate the above-mentioned modules of an endoscopic capsule.

Fig. 3

System architecture of a robotic capsule (Courtesy of Virgilio Mattoli)
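The six-module architecture can be written down as a small data structure, which makes it easy to state which modules a given device embeds and which it leaves out. The following is a minimal sketch only: the `Module` enumeration mirrors the list in the text, while `missing_modules` and the example module set of a passive capsule are illustrative assumptions rather than the specification of any commercial device.

```python
from enum import Enum


class Module(Enum):
    """The six primary modules of a robotic capsule platform, as listed in the text."""
    LOCOMOTION = "locomotion"
    LOCALIZATION = "localization"
    VISION = "vision"
    TELEMETRY = "telemetry"
    POWERING = "powering"
    DIAGNOSIS_TREATMENT = "diagnosis and treatment tools"


def missing_modules(embedded):
    """Return the modules (in definition order) that a capsule does not embed."""
    return [m for m in Module if m not in embedded]


# Hypothetical passive capsule: camera, radio link and battery only.
passive_capsule = {Module.VISION, Module.TELEMETRY, Module.POWERING}
for module in missing_modules(passive_capsule):
    print("not embedded:", module.value)
```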

3.1 Locomotion

Locomotion is a crucial aspect that must be considered when designing a robotic endoscopic capsule. WCEs can be active or passive, depending on whether they have controlled or non-controlled locomotion. Passive locomotion currently dominates the market (e.g., PillCam® WCEs). Active locomotion is still primarily at research level, but it has great potential, since it would enable the clinician to manoeuvre the device for precise targeting. However, the main issue is related to technological integration. It is difficult to embed a locomotion module into a swallowable capsule because of the size of the actuators and power constraints. For instance, the power consumption of a legged capsule device is about 400 mW for the motors alone, consequently requiring the integration of a high-capacity, and therefore bulky, battery [29].
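To give a feeling for why a 400 mW motor load is demanding, the ideal runtime of a battery under constant load can be estimated as stored energy divided by power. The sketch below is a back-of-the-envelope calculation only; the 50 mAh, 3 V cell is a hypothetical example (not a figure from [29]), and conversion losses and the consumption of the other modules are ignored.

```python
def runtime_minutes(capacity_mah: float, voltage_v: float, load_mw: float) -> float:
    """Ideal runtime of a battery under a constant load, ignoring all losses."""
    if load_mw <= 0:
        raise ValueError("load_mw must be positive")
    energy_mwh = capacity_mah * voltage_v  # mAh * V = mWh
    return 60.0 * energy_mwh / load_mw


# Hypothetical 50 mAh, 3 V cell driving only the 400 mW motor load.
print(runtime_minutes(50.0, 3.0, 400.0))  # 22.5
```

Under these assumptions the motors alone would drain the cell in well under half an hour, far less than the transit time of a passive capsule through the bowel.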

Two main strategies allow the implementation of active locomotion in an endoscopic swallowable capsule: one consists of embedding one or more miniaturized locomotion systems on board, i.e. internal locomotion; the other requires an external approach, i.e. external locomotion. The latter approach generally relies on magnetic field sources.

3.1.1 Internal locomotion

Different internal locomotion approaches have been investigated in literature and the most significant solutions will be presented and analysed in this paragraph.

An interesting active capsule system for gastroscopy was developed by Tortora et al. [30] (Fig. 4a - left). The submarine-like robotic capsule exploits four independent miniaturized propellers actuated by DC brushed motors; placed in the rear part of the capsule, propellers are wirelessly controlled to guarantee 3D navigation of the capsule in a water-filled stomach. An advanced version of this capsule, which embeds a camera module, has been developed by De Falco et al. [31] (Fig. 4a – right). Other possible bio-inspired approaches of swimming in a water-filled stomach cavity include flagellar or flap-based swimming mechanisms [32, 33]. Caprara et al. recently developed an innovative approach for stomach inspection that consists of a soft-tethered gastroscopic capsule; the camera capsule is oriented by means of water jets provided by a multichannel external water distribution system (Fig. 4b) [34].

Fig. 4

Internal locomotion platforms: a Swimming capsule by Tortora et al. [30] (left) and by De Falco et al. [31] (right); b water jet-based soft-tethered capsule by Caprara et al. [34]; c earthworm-like locomotion device by Kim et al. [35, 36]; and d wired colonoscopic capsule with micro-patterned treads by Sliker et al. [39]

Several mechanisms based on internal locomotion approaches have been developed for targeting the entire intestine (i.e., large and small bowel).

A mechanism bio-inspired by earthworm locomotion was developed by Kim et al. [35, 36] (Fig. 4c); it consists of shape-memory alloy (SMA) spring actuators performing cyclic compression/extension and anchoring systems based on directional micro-needles. Another bio-inspired solution for internal locomotion was proposed by Li et al. [37]. It exploits a mechanism mimicking cilia extension using six SMA-actuated units, each provided with two SMA actuators for enabling bidirectional motion.

A paddling-based technique, for crawling in the intestine, was proposed by Park et al. [38]. The capsule uses multiple legs that travel from the front to the back of the capsule, in contact with the tissue, allowing directional propulsion along the lumen.

Sliker et al. developed a wired colonoscopic capsule composed of micro-patterned treads [39]. The capsule drives eight polymer treads simultaneously through one single motor (Fig. 4d). Interaction of the treads (located on the outer surface of the capsule) with the tissue guarantees propulsion of the capsule device.

Bio-inspired leg-based capsules were also developed by The BioRobotics Institute of the Scuola Superiore Sant’Anna in Italy. Increasingly sophisticated legged robot prototypes using embedded brushless motors (i.e., 4 legs [29], 8 legs [40], and 12 legs [41, 42]), were developed starting from a first generation SMA-based solution [43]. Legged capsules demonstrated effective bidirectional control, stable anchorage and adequate visualization of the lumen without the need for insufflation.

Finally, electrical stimulation of the GI muscles was proposed as a method for roughly controlling capsule locomotion or at least to stop it by generating a temporary restriction in the bowel [44, 45].

Although internal locomotion has significant advantages, such as the local distension of the tissue (i.e., no insufflation is required for accurate visualization of the lumen), it comes with a major drawback: the on-board actuators, transmission mechanisms and high-capacity power modules occupy considerable internal volume, making it difficult to keep the device within the size of an ingestible capsule.

3.1.2 External locomotion

The external locomotion approach entails external field sources, either permanent magnets or electromagnets, that interact with magnetic components embedded in the capsule to provide navigation and steering. The benefit of the external approach is that no on-board actuators, mechanisms or batteries are needed for locomotion, since the capsule only has to integrate a small magnetic field source, in most cases a permanent magnet.

Given Imaging Ltd. investigated the use of a handheld external permanent magnetic source to navigate a capsule in the upper GI tract using a customized version of PillCam Colon, which integrates a permanent magnet, as part of the European FP6 project called “Nanobased Capsule-Endoscopy with Molecular Imaging and Optical Biopsy (NEMO project)” [46].

Carpi et al. exploited a cardiovascular magnetic navigation system (Niobe, Stereotaxis, Inc., St. Louis, MO, USA) for the robotic navigation of a magnetically modified endoscopic capsule, i.e. a PillCam SB, Given Imaging Ltd., for gastric examination [47, 48] (Fig. 5a).

Fig. 5

External locomotion platforms: a GI tract exploration platform developed by Carpi et al. exploiting the Stereotaxis system [47, 48]; b magnetically-driven capsule with vibration by Ciuti et al. [54]; c gastric examination platform developed cooperatively by Olympus Inc. and Siemens AG Healthcare [57]; and d SUPCAM endoscopic capsule (© 2015 Lucarini G, Ciuti G, Mura M, Rizzo R, Menciassi A. Published in [59] under CC BY 3.0 license. Available from: http://dx.doi.org/10.5772/60134)

An active locomotion approach based on permanent magnets (outside and inside the capsule) was proposed by Ciuti et al. [49, 50]. The platform combines the benefits of magnetic field strength and limited encumbrance with accurate and reliable control through the use of an anthropomorphic robotic arm. Tested in a comparative study, colonoscopy using this novel robotically-driven capsule was feasible and showed adequate accuracy compared to conventional colonoscopy [51]. This approach was investigated in the framework of the FP6 European Project called “Versatile Endoscopic Capsule for GI TumOr Recognition and therapy (VECTOR project)” [52]. A significant derivative technology from the VECTOR project consisted of a soft-tethered magnetically-driven capsule for colonoscopy [53]; the device represents a trade-off between capsule and traditional colonoscopy combining the benefits of low-invasive propulsion (through “front-wheel” locomotion) with the multifunctional tether for treatment. Ciuti et al. also proposed a magnetically-driven capsule with embedded vibration mechanisms (i.e., motor with an asymmetric mass on the rotor) which allow progression of the capsule along the lumen and reduced friction [54] (Fig. 5b). Mahoney and Abbott addressed a permanent magnetic-based actuation method for helical capsules by optimizing magnetic torque while minimizing magnetic attraction [55]. The same authors demonstrated the 5-DOF manipulation of an untethered magnetic device in fluid [56].

A novel endoscopy platform for gastric examination was developed by Olympus Inc. and Siemens AG Healthcare (Erlangen, Germany). The system combines an Olympus endoscopic capsule (31 mm in length and 11 mm in diameter, with two image sensors each operating at 4 frames per second (fps)) with Siemens magnetic guidance equipment, composed of magnetic resonance imaging and computed tomography. A dedicated control interface allows the navigation of the capsule system with five degrees of freedom (i.e., 3D translation, tilting and rotation) [57] (Fig. 5c).

A hand-guided external electromagnetic system is at the basis of the robotic endoscopic platform developed during the European FP7 project called “New cost–effective and minimally invasive endoscopic device able to investigate the colonic mucosa, ensuring a high level of navigation accuracy and enhanced diagnostic capabilities (SUPCAM project)” [58, 59]. The external electromagnetic source navigates a colonoscopic spherical-shape capsule provided with an internal permanent magnet, able to perform a 360° inspection through inner camera rotation (Fig. 5d).

A significant limitation of the external magnetic locomotion approach is the difficulty in obtaining effective visualization and also locomotion in a collapsed environment. Solutions for local tissue distension were proposed by several researchers, as reported in [60, 61].

3.2 Localization

The position and orientation of the capsule are needed to locate lesions in the GI tract, plan follow-up treatment and provide feedback for capsule motion (in the case of active locomotion). For this reason, an accurate localization system is crucial for WCE [62]. Commercially available WCEs employ different localization strategies: for example, Given Imaging patented a localization method based on a single electromagnetic sensor coil in 2013 [63], whereas IntroMedic’s localization method relies on electric potential values [64].

One of the methods used for localization consists of capturing images from the capsule: each region of the GI tract is identified by anatomical landmarks [65]. Spyrou et al. proposed an image-based tracking method using algorithms for 3D reconstruction based on the registration of consecutive frames [66].

Several research teams have instead focused on localization techniques based on magnetic fields and electromagnetic waves. Low-frequency magnetic signals can pass through human tissue without any attenuation [67]; in addition, magnetic sensors do not need to be in the line of sight to detect the capsule [68]. However, precision decreases if a ferromagnetic tool is unintentionally inserted into the workspace [69]; moreover, the size of the permanent magnet is restricted by the dimensions of the capsule, which limits the accuracy of the results [70]. Finally, if a magnetic actuation strategy is implemented, undesired interference with the magnetic localization system may occur.

Weitschies et al. were the first to equip a capsule with a permanent magnet for passive capsule endoscopy [71, 72]. A 37-channel superconducting quantum interference device (SQUID) magnetometer was used to record the magnetic field distribution over the abdomen at several time intervals. The spatial resolution was in the order of a few millimetres and the temporal resolution was in the order of milliseconds. Wu et al. [73] developed a wearable tracking vest consisting of an array of Hall-effect sensors, which was used to track a capsule provided with a neodymium magnet. The array measured around 40 cm × 25 cm × 40 cm (length × width × height) in order to cover the stomach and small intestine area of a normal human body.

Instead, Plotkin et al. [74] used a large array (8 × 8 matrix) of coplanar transmitting coils. At the beginning of the procedure, the complete transmitting array is sequentially activated to obtain the initial position of the receiving coil, which is enclosed inside the capsule. Only a sub-array of 8 coils is used in the following tracking stages. The authors report a tracking accuracy of 1 mm and 0.6°. Another approach was proposed by Guo et al. [75], who used three external energized coils fixed on the patient’s abdomen. The coils were arranged to excite three-axis magneto-resistive sensors inside a capsule that measured the electromagnetic field strength. This method is based on the magnetic dipole principle. The reported position and orientation errors were 6.25–36.68 mm and 1.2–8.1°, respectively, in the range of 0–0.4 m.
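The magnetic dipole principle referred to above models a small coil or magnet as a point dipole, whose flux density at an offset r is B = (μ0/4π)·(3 r̂ (m·r̂) − m)/|r|³. The sketch below implements only this textbook forward model; it is not the algorithm of Guo et al., and the function name and the 1 A·m² example moment are illustrative assumptions. A localization method would invert such a model, i.e. search for the position and orientation that best explain the sensor readings.

```python
import math

MU0_OVER_4PI = 1e-7  # T*m/A, approximate value of mu_0 / (4*pi)


def dipole_field(m, r):
    """Flux density (T) at offset r (m) from a point dipole with moment m (A*m^2)."""
    dist = math.sqrt(sum(c * c for c in r))
    if dist == 0.0:
        raise ValueError("field is undefined at the dipole position")
    r_hat = [c / dist for c in r]
    m_dot_r = sum(mi * ri for mi, ri in zip(m, r_hat))
    # B = (mu0 / 4 pi) * (3 r_hat (m . r_hat) - m) / |r|^3, component by component
    return [MU0_OVER_4PI * (3.0 * m_dot_r * ri - mi) / dist ** 3
            for mi, ri in zip(m, r_hat)]


# Hypothetical 1 A*m^2 dipole along z, sampled 100 mm away on and off its axis.
print(dipole_field((0.0, 0.0, 1.0), (0.0, 0.0, 0.1)))
print(dipole_field((0.0, 0.0, 1.0), (0.1, 0.0, 0.0)))
```

For this example the field is about 0.2 mT on the axis and about 0.1 mT, with opposite sign, in the equatorial plane. The inverse-cube decay is one reason why the reported errors grow with distance.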

Several approaches are possible for the application of active actuation systems. The Olympus group [76] proposed a plurality of magnetic field detecting devices. These devices are placed on the patient’s body to detect the strength of the magnetic field from the coil in the capsule, which is induced by an external magnetic field device. The operating frequency lies in the range of 1 kHz to 1 MHz to avoid absorption by living tissue. This technique has an accuracy of better than 1 mm when the resonant circuit is placed within 120 mm of the detecting coil array.

Similar ideas were proposed by Kim et al. [77], who used magnets inside an endoscopic capsule. An external rotating magnetic field forced the capsule to rotate. Three Hall-effect sensors inside the capsule were employed to measure the position and the orientation of the capsule. The authors state that the largest position detection error is less than 15 mm, and that the orientation detection error in the pitching direction lies between −4° and 15°.

Salerno et al. proposed a localization system, compatible with external magnetic locomotion, based on a triangulation algorithm. It uses a custom on-board tri-axial magnetic sensor to detect the capsule in the GI tract. The reported position errors are 14 mm along the X axis, 11 mm along the Y axis (where X and Y are in the plane of the abdomen) and 19 mm along the Z axis. Salerno et al. [78] also developed an online localization system (working at 20 Hz) embedding a 3D Hall sensor and a 3D accelerometer with pre-calculated magnetic field maps describing the external-source magnetic field. The authors reported a position error of less than 10 mm when the localization module and the external magnet are at a distance of 120 mm.
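The idea of a pre-calculated field map can be illustrated in one dimension: the field of the external source is tabulated offline, and at run time the on-board reading is matched against the table. The sketch below is a deliberately simplified toy and not the system of [78]; it assumes an on-axis dipole model for the external magnet, a hypothetical moment of 10 A·m², a 5 mm grid and noise-free readings, and all function names are illustrative.

```python
MU0_OVER_4PI = 1e-7  # T*m/A, approximate value of mu_0 / (4*pi)


def build_axial_map(moment_am2, distances_m):
    """Pre-compute the on-axis dipole flux density (T) at each tabulated distance."""
    return [2.0 * MU0_OVER_4PI * moment_am2 / d ** 3 for d in distances_m]


def lookup_distance(measured_t, distances_m, field_map):
    """Return the tabulated distance whose stored field is closest to the measurement."""
    best = min(range(len(field_map)), key=lambda i: abs(field_map[i] - measured_t))
    return distances_m[best]


# Hypothetical setup: 10 A*m^2 external magnet, map from 50 mm to 200 mm in 5 mm steps.
distances = [0.050 + 0.005 * i for i in range(31)]
field_map = build_axial_map(10.0, distances)
measured = 2.0 * MU0_OVER_4PI * 10.0 / 0.120 ** 3  # ideal, noise-free reading at 120 mm
print(round(lookup_distance(measured, distances, field_map) * 1000))  # 120
```

A real system works in three dimensions and fuses the field reading with other data (an accelerometer in the case described above), but the trade-off is the same: the expensive field computation is done once, leaving only a cheap lookup in the online loop.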

The localization algorithm presented by Di Natali et al. [52, 79] is compatible with magnetic manipulation. It is a real-time detection strategy employing multiple sensors with a pre-calculated magnetic field map. The proposed approach showed a position detection error below 5 mm and an angular error below 19° within a spherical workspace of 15 cm in radius. The same authors proposed a Jacobian-based iterative method for magnetic localization in robotic capsule endoscopy. The overall refresh period was 7 ms, thus enabling closed-loop control strategies for magnetic manipulation running faster than 100 Hz. The average localization error, expressed in cylindrical coordinates, was below 7 mm in both the radial and axial components and 5° in the azimuthal component [80].
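The link between the 7 ms figure quoted above and the 100 Hz claim is a one-line conversion from update period to update rate, f = 1/T. The helper below is only a worked check of that arithmetic; its name is an illustrative choice.

```python
def update_rate_hz(period_ms: float) -> float:
    """Convert an update period in milliseconds to an update rate in hertz."""
    if period_ms <= 0:
        raise ValueError("period_ms must be positive")
    return 1000.0 / period_ms


rate = update_rate_hz(7.0)
print(round(rate, 1))  # 142.9
```

A new pose estimate every 7 ms therefore corresponds to roughly 143 Hz, which leaves some margin above a 100 Hz control loop.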

Electromagnetic wave methods are also used alongside these approaches. Radio frequency has been widely used for locating an object in both outdoor and indoor environments, achieving an accuracy of hundreds of millimetres [81]. Given Imaging Ltd. integrated this method of localization in the PillCam®SB system. Eight sensors placed on the upper abdomen measure the strength of the signals emitted by the capsule. The average position error is 37.7 mm and the maximum error is 114 mm [82].
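One simple way to turn signal-strength readings from several body-worn sensors into a position estimate is a power-weighted centroid: each sensor position is weighted by the linear power it receives. The sketch below illustrates this generic technique only; it is not the algorithm used in the PillCam® system, and the sensor coordinates and dBm values are hypothetical.

```python
def weighted_centroid(sensors, rssi_dbm):
    """Estimate a 2D emitter position as the power-weighted mean of sensor positions."""
    if len(sensors) != len(rssi_dbm) or not sensors:
        raise ValueError("need one RSSI reading per sensor")
    weights = [10.0 ** (p / 10.0) for p in rssi_dbm]  # dBm -> mW
    total = sum(weights)
    x = sum(w * sx for w, (sx, _) in zip(weights, sensors)) / total
    y = sum(w * sy for w, (_, sy) in zip(weights, sensors)) / total
    return x, y


# Hypothetical 100 mm square of four sensors; the reading at (100, 100) is 10 dB stronger.
sensors = [(0.0, 0.0), (100.0, 0.0), (0.0, 100.0), (100.0, 100.0)]
print(weighted_centroid(sensors, [-60.0, -60.0, -60.0, -50.0]))
```

Because tissue attenuation varies strongly along different paths, such estimates are coarse, which is consistent with the error of several tens of millimetres reported above.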

Clinical practice suggests other approaches, among which is the use of medical imaging. X-rays can be exploited to track an object, e.g. an endoscopic capsule placed inside the digestive tract [47]. Gamma scintigraphy is also used to visualize the position of the Enterion capsule, a drug-delivery-type capsule, in real time [83]. MRI was proposed by Dumoulin et al. to track interventional devices in real time [84].

3.3 Vision

The main purpose of a WCE is of course to obtain images of the internal anatomy. Therefore, imaging capabilities, in terms of modality, sensor characteristics and illumination, are among the most important features that must be considered when designing these systems [85, 86]. A variety of solutions have been proposed featuring a range of capabilities. The following aspects are considered in this section: i) sensor resolution, ii) sensor location, iii) field-of-view (FoV), iv) illumination, and v) modality.

3.3.1 Sensor temporal and spatial resolution

Both temporal resolution (frames per second, fps) and spatial resolution (number of pixels) are important criteria for imaging. Temporal resolution determines how much of the GI tract the capsule can cover while travelling through the patient: if it is too low, there is a risk of omitting gastrointestinal regions from the examination. Only a few years ago, typical commercial systems operated at 2–3 fps [87]; however, faster systems above 16 fps are the more recent standard [88], often with variable fps control settings.
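The effect of frame rate on coverage can be made concrete: the distance a capsule travels between two consecutive frames is its speed divided by the frame rate. The sketch below is a worked illustration only; the 10 mm/s transit speed is a hypothetical value chosen for the example, not a figure from the cited studies.

```python
def gap_between_frames_mm(speed_mm_s: float, fps: float) -> float:
    """Distance travelled by the capsule between two consecutive frames."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    return speed_mm_s / fps


# Hypothetical fast transit at 10 mm/s, at an early and at a recent frame rate.
for fps in (2.0, 16.0):
    print(fps, "fps ->", gap_between_frames_mm(10.0, fps), "mm between frames")
```

At the assumed speed the capsule moves 5 mm between frames at 2 fps but only 0.625 mm at 16 fps, which is why adaptive frame rates that rise when the capsule moves quickly are attractive.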

Spatial resolution determines the quality of diagnosis that can be achieved for site analysis. This can be particularly important if the texture or appearance of the gastrointestinal surface is used for diagnosis or staging of the disease, for instance in pathologies such as Barrett’s oesophagus, where image texture is an indicator of disease progression. The typical spatial resolution for early capsule platforms was approximately 360 × 240 pixels; however, new systems are able to achieve higher resolutions [88, 89].

3.3.2 Sensor location

The position of the sensor and lens determines the region imaged by the travelling capsule. Since most devices are designed with a predicted travel direction in the long axis of the capsule, the majority of image sensors are mounted at the tip of the capsule. More recently, capsules integrating multiple cameras have been developed and can potentially acquire images looking forward and backwards, such as PillCam® Colon 2 and UGI capsules. Similar results may also be achieved through lens design [90]. In particular, microlens arrays or lenticular lens arrays, which have been demonstrated in laparoscopic surgery [91], could be used to provide a multi-view image using a single sensor to maintain a small device footprint. Different configurations, where a side viewing sensor is used, have also been explored because looking forwards is not always the clinically optimal configuration [92]. Side viewing capabilities can potentially map the entire surrounding lumen around the capsule and ensure a continuous monitoring functionality, which can also be important for mapping algorithms.

3.3.3 Field-of-view

The limited workspace within the gastrointestinal system means that the distance between the capsule and the tissue is very small. This requires capsules to provide a wide field of view in order to observe a sufficient portion of the tissue walls. Typically, the FoV of capsules ranges between 140° and 170°. However, different setups have been explored; for example, the CapsoCam capsule presents a new concept with a 360° panoramic lateral view obtained with four cameras [92].
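The need for a wide FoV at short working distances follows from simple geometry: for an ideal, distortion-free camera facing a flat wall at distance d, the visible width is 2·d·tan(FoV/2). The sketch below evaluates this relation under exactly those idealised assumptions; the 10 mm working distance is a hypothetical example, and real capsule optics have strong distortion at such angles.

```python
import math


def visible_width_mm(fov_deg: float, distance_mm: float) -> float:
    """Width of a flat wall seen by an ideal pinhole camera at the given distance."""
    if not 0.0 < fov_deg < 180.0:
        raise ValueError("fov_deg must lie strictly between 0 and 180 degrees")
    return 2.0 * distance_mm * math.tan(math.radians(fov_deg) / 2.0)


# Hypothetical 10 mm working distance, at the two ends of the typical FoV range.
for fov in (140.0, 170.0):
    print(fov, "deg ->", round(visible_width_mm(fov, 10.0), 1), "mm")
```

Under these assumptions a 140° lens sees about 55 mm of wall from 10 mm away, and the footprint grows very quickly as the FoV approaches 180°.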

3.3.4 Illumination

Image quality is inherently governed by the illumination and sensor capabilities of the endoscopic capsule. Illumination is typically provided by LED sources configured to provide white light images, which are most commonly used for interpretation by the physician. Adaptive illumination strategies have recently been under investigation for achieving optimal image quality while preserving battery power based on image processing [93]. Different strategies can be employed to conserve power, e.g. the brightness of the image can be used to estimate the distance from the surface under observation because illumination power is a function of distance. The overall image brightness can then be adapted to maintain diagnostic image quality.
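The brightness-driven adaptation described above can be illustrated with a minimal sketch. It is not the method of [93]: it assumes a simple proportional controller, and the function names, the target brightness, the gain and the duty-cycle limits are all hypothetical values chosen for illustration only.

```python
def mean_brightness(frame):
    """Mean pixel intensity of a grayscale frame given as rows of 0-255 values."""
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    return total / count


def next_led_duty(duty, frame, target=128.0, gain=0.002, lo=0.05, hi=1.0):
    """Proportional update of the LED duty cycle from the last frame's brightness.

    target, gain, lo and hi are hypothetical tuning values, not taken from
    any cited system.
    """
    error = target - mean_brightness(frame)
    # Raise the drive when the frame is too dark, lower it when too bright,
    # and clamp the result to the allowed duty-cycle range.
    return min(hi, max(lo, duty + gain * error))


# A dark frame (distant wall) raises the duty cycle; a bright one lowers it.
dark = [[40] * 4 for _ in range(3)]
bright = [[220] * 4 for _ in range(3)]
print(next_led_duty(0.5, dark))
print(next_led_duty(0.5, bright))
```

Lowering the drive whenever the tissue is close and the frame is already bright is what saves battery power in such a scheme; a real device would also have to account for sensor gain and exposure settings.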

3.3.5 Modality

White light (WL) is the main modality for WCE imaging because it is the signal best understood by physicians for interpretation. Nevertheless, molecular imaging has been explored by a number of teams and projects [94]. Autofluorescence capsule prototypes have been explored [95] for potentially detecting disease without an on-board camera [96, 97]. NBI potentially exposes useful subsurface vessel information that can characterize disease [98]; details are reported in the previous sections. The type of light and the 2D or 3D image modalities of the capsule have also been explored. Multiple 2D images can potentially offer a 3D reconstruction of the video, which could allow more accurate lesion classification [99–102].
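As an illustration of how two 2D views can yield depth, the sketch below applies the standard pinhole triangulation relation for a rectified pair of views, Z = f·B/d. It is a textbook relation, not the reconstruction method of the cited works; the function name and the numerical values (focal length, baseline, disparity) are hypothetical.

```python
def depth_from_disparity(focal_px, baseline_mm, disparity_px):
    """Depth (mm) of a point seen in two rectified views.

    focal_px     : focal length expressed in pixels (assumed known from calibration)
    baseline_mm  : distance between the two viewpoints
    disparity_px : horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_mm / disparity_px


# Hypothetical numbers: 300 px focal length, 4 mm baseline, 60 px disparity.
print(depth_from_disparity(300, 4, 60))
```

Because depth is inversely proportional to disparity, the short baselines available inside a capsule limit depth resolution for distant tissue, which is one reason the viewing geometry matters.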

Stereoscopic systems have already been developed, such as the one developed in 2013 by Simi et al. for laparoscopic procedures [103]. A new concept of capsule with a stereo camera system for colonoscopy will be investigated by the authors within the EndoVESPA EU project [104]. Modalities that can penetrate deeper within tissue walls are also under exploration, for example in the UK SonoPill project [10]. It is likely that new methods for manufacturing microarrays for sensing ultrasound will play an important role in enabling such sensing capabilities [105].

3.4 Telemetry

How to transmit and receive data is a central topic in WCE technology. A high data rate telemetry system is essential to allow high-resolution imaging. Due to size constraints and technological limitation of wireless communication, the telemetry subsystem is often a bottleneck in capsule design. Robotic endoscopic capsules can employ radiofrequency transmission, human body communication, or can also integrate a data storage system, thus avoiding wireless communication [106]. Human body communication technology uses the human body as a conductive medium. It requires less power than radiofrequency communication, but involves a large number of sensor electrodes on the skin [107].

Wireless capsules using radiofrequency communication are attractive because of their efficient transmission through the layers of the skin. This is especially true for low frequency transmission (UHF-433 ISM and lower) [108]. However, low frequency transmission requires large electronic components. Given Imaging’s capsules embed a Zarlink transceiver and transmit 2.7 Mb/s at 403–434 MHz [11]. Part of the recent research in this area focuses on developing impulse radio ultra-wideband (IR-UWB) antennas for WCE [109–111]. A CMOS system employing ON-OFF keying modulation, with a superheterodyne receiver, is presented in [112]. A low power transmitter working in the ISM 434 MHz band is discussed in [113]. It is designed using 0.13 μm CMOS technology and consumes 1.88 mW.
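To make the link between data rate and imaging concrete, the following back-of-the-envelope sketch estimates the frame rate that a given link can sustain. It combines the 2.7 Mb/s figure quoted above with the 360 × 240 pixel resolution of early capsules mentioned in the spatial resolution section; the 8 bits per pixel, the compression ratio and the function name are assumptions, and protocol overhead is ignored.

```python
def max_frame_rate(link_bps, width, height, bits_per_pixel, compression_ratio=1.0):
    """Upper bound on frames per second for a given link rate.

    compression_ratio is a hypothetical factor (raw size / transmitted size);
    framing and error-correction overhead are not modelled.
    """
    bits_per_frame = width * height * bits_per_pixel / compression_ratio
    return link_bps / bits_per_frame


# Uncompressed 8-bit frames over a 2.7 Mb/s link.
print(max_frame_rate(2.7e6, 360, 240, 8))
# The same link with an assumed 10:1 compression.
print(max_frame_rate(2.7e6, 360, 240, 8, compression_ratio=10))
```

Under these assumptions an uncompressed stream is limited to a few frames per second, which illustrates why telemetry bandwidth and on-board compression constrain the achievable image resolution.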

3.5 Powering

Power management is a major challenge in WCE because of size constraints and power-hungry components, such as LEDs [114]. Capsules are usually powered by silver oxide button batteries; two or three of them allow up to 15 h of operation [106]. Lithium ion polymer batteries, as well as thin film batteries, are promising solutions to enhance power density and reduce battery dimensions [11].

Wireless power transfer has also been investigated. In particular, RF power transfer is highly suitable for medical devices, since it is non-invasive and non-ionizing [114]. An inductive power system, operating at 1 MHz and able to supply 300 mW, is presented in [115]. A portable magnetic power transmission system is demonstrated in [116]. The system was also tested on a pig and showed an energy conversion efficiency of 2.8 %. An inductive-based wireless recharging system is presented in [117]. It can provide up to 1 W power and is able to recharge a VARTA CP 1254 battery in 20 min. In [118] an analytical comparison among simple solenoid, pair of solenoids, double-layer solenoids, segmented-solenoid, and Helmholtz power transmission coils (PTCs) is carried out with a FEM simulation. It shows that the segmented solenoid PTC can transfer the maximum amount of power.
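The practical meaning of a low conversion efficiency can be shown with a short calculation. The sketch below is purely illustrative: it pairs the 300 mW load of [115] with the 2.8 % efficiency of [116], although these figures come from different systems, and the function name is hypothetical.

```python
def required_transmit_power_w(load_power_w, efficiency):
    """External power (W) needed to deliver load_power_w at a given link efficiency."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    return load_power_w / efficiency


# Illustrative pairing of figures from two different cited systems.
print(required_transmit_power_w(0.300, 0.028))
```

At efficiencies of a few per cent, most of the transmitted power is lost outside the capsule, so coil geometry (as compared in [118]) has a direct impact on the size of the external transmitter.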

3.6 Diagnosis and treatment

Progress in micro-electromechanical systems (MEMS) technologies has led to the development of new endoscopic capsules with enhanced diagnostic capabilities in addition to traditional visualization of the mucosa (e.g. embedded pressure, pH, blood detection and temperature sensors). However, current capsule endoscopes lack treatment modules, thus requiring a subsequent traditional endoscopic procedure. Developing clinical capsules with diagnostic, interventional and therapeutic capabilities, such as biopsy sampling, clip release for bleeding control, and/or drug delivery, will allow WCE to become the mainstream endoscopic mode. This section is divided into two parts: i) diagnosis, and ii) treatment, as reported below.

3.6.1 Diagnosis

With regard to image-based diagnosis and derived algorithms for enhanced diagnosis, even if a full 3D map of the distance covered by the diagnostic imaging device is not achieved, 3D surface shape at specific time instants provides important diagnostic information. In particular, these shape cues can be used to identify polyp structures; indeed, methods for automatically identifying and analysing them have been developed [119, 120]. While 3D methods based on shading or multiple views are interesting, the most clinically relevant advances in computational processing of WCE images have been based on 2D data. Specifically, automated abnormality detection and highlighting have been investigated and proposed [121–123]. These methods benefit significantly from recent developments in machine learning, especially convolutional neural networks, which have been shown to be highly capable of addressing detection and segmentation problems. Training data for machine learning methods is still a challenge; however, the community is moving towards addressing this issue, for example with the recent EndoVis challenge at MICCAI 2015 [124].

With regard to enhanced diagnostic inference from capsule images, various methods have been proposed to enhance the image; however, none has yet taken into account strong appearance and surface tissue shape priors. Enhancing image quality and information is critical to reduce the chances of missing potential adenocarcinomas (currently estimated at around 6 %) [125, 126]. A quick-view function is important for allowing practical analysis of capsule videos, which can be lengthy [127].

Apart from image enhancement, molecular imaging in gastrointestinal endoscopy has recently emerged as an exciting technology with powerful diagnostic accuracy, encompassing different modalities that can visualize disease-specific morphological or functional tissue changes based on the molecular signature of individual cells [87, 128].

It is also worth mentioning that a large number of endoscopic capsules have been designed with embedded sensing capabilities, most of which are already available on the market.

Gonzalez-Guillaumin et al. [129] developed a wireless capsule for the detection of gastro-oesophageal reflux disease. It embeds impedance and pH sensors, and uses a magnetic holding solution for surgical fixation. Johannessen et al. developed a wireless multi-sensor system (Lab-in-a-Pill capsule) equipped with a control chip, a transmitter, and sensors for pH, conductivity, temperature and dissolved oxygen [130]. The FDA-approved Bravo capsule for pH monitoring (produced by Given Imaging Ltd.) is a device for evaluating gastroesophageal reflux disease [131].

The OMOM pH capsule (JinShan Science & Technology Co., Ltd., Chongqing, China) is a wireless pH monitoring system not approved by the FDA. It needs to be anchored in the oesophagus and is able to transmit pH data to an external recorder. The SmartPill capsule by Given Imaging Ltd. is an FDA-approved capsule for assessment of GI motility. It measures several physiological parameters (e.g., pH, pressure and temperature) while travelling through the GI tract [132]. CorTemp by HQ Inc. (Palmetto, FL, USA) is an FDA-approved capsule for internal body temperature measurement, with ±0.1 °C accuracy (Fig. 6a). The working principle is based on a quartz crystal, embedded into the capsule, whose vibration frequency varies with temperature. Consequently, a magnetic flux is established, and a low-frequency signal flows through the body [133, 134].

Fig. 6 (a) CorTemp by HQ Inc. (Palmetto, FL, USA) [133, 134]; (b) HemoPill acute by Ovesco Endoscopy AG (Tübingen, Germany) [136]

As an attempt to provide less invasive imaging of the GI tract, Check-Cap (Mount Carmel, Israel) is a wireless capsule that uses low-dose radiation to obtain a 3D reconstructed image of the colon. It could mitigate the need for bowel preparation [135].

The HemoPill acute by Ovesco Endoscopy AG (see Fig. 6b) is a wireless capsule able to detect acute bleeding in the upper GI tract. It contains an optical sensor able to detect blood in the organ content in concentrations as low as 1 % [136].

3.6.2 Treatment

Several research groups have developed endoscopic capsules with embedded modules for GI tract treatment. Valdastri et al. developed a capsule for treating bleeding in the GI tract, which is able to electrically release an endoscopic clip (Fig. 7a) [137].

Fig. 7 (a) Therapeutic wireless endoscopic capsule with an endoscopic clip for treating bleeding in the GI tract, produced by Valdastri et al. [137]; (b) magnetic-driven biopsy capsule produced by Simi et al. [140]; (c) therapeutic capsule for bioadhesive patch release produced by Quaglia et al. [142]; (d) soft-tethered therapeutic capsule colonoscope developed by Valdastri et al. [146]; (e) and (f) capsule for photodynamic therapy of Helicobacter pylori bacterium by Tortora et al. [145]

Kong et al. developed a wireless capsule for biopsy. It consists of a rotational tissue cutting razor fixed to a torsional spring, constrained by a paraffin block [138]; a more advanced version of the device was developed and presented in [139].

Simi et al. developed a wireless capsule for biopsy. It employs magnetic fields for stabilization of the capsule and enables reliable sampling during the biopsy (Fig. 7b) [140].

A novel biopsy method using deployable microgrippers released in the stomach from a capsule robot has been proposed by Yim et al. [141]. The capsule is positioned in the stomach by means of a magnetic system. The microgrippers fold and collect small biopsies when triggered by body heat. The capsule then collects the microgrippers using a wet-adhesive patch.

The system developed by Quaglia et al. (Fig. 7c) [142] exploits a spring mechanism based on SMA for unlocking a bench compressed by a super elastic structure in order to release a bioadhesive patch. Several other robotic capsules for drug delivery have been developed by different research groups. A significant example is the one developed by Woods et al. that consists of a micro-positioning mechanism for targeted drug delivery and a holding mechanism used for resisting against peristaltic contractions [143]. An exhaustive review of drug delivery systems for capsule endoscopy has been written by Munoz et al. [144].

Tortora et al. developed a capsule for the photodynamic therapy of the Helicobacter pylori bacterium, consisting of a swallowable device that includes LEDs and a battery and is able to deliver light at the specific wavelengths to which Helicobacter pylori is sensitive [145]. A robotic capsule able to provide several therapeutic functionalities was developed by Valdastri et al. (Fig. 7d) [146]. The soft-tethered capsule colonoscope features a compliant multilumen tether for suction, irrigation, insufflation or access for standard endoscopic tools (e.g., polypectomy snares, biopsy forceps, retrieval baskets and graspers). Gorlewicz et al. proposed a method for obtaining tissue distension. It consists of a tetherless insufflation system, based on a controlled phase transition of a small volume of fluid (stored on board the capsule) to a large volume of gas, emitted into the intestine [60].

Finally, artificial touch is an enabler of research progress towards minimally invasive surgery (MIS) in medical robotics, particularly with respect to operational safety, automation of interventional procedures, the capability to reproduce haptic feedback and the characterization of tissues for diagnostic purposes [147, 148]. Over recent years, tactile sensing has demonstrated major breakthroughs in the domain of hand neuroprosthetics [149–155], and there is relevant literature showing the benefits (e.g., considering duration and effectiveness of operations) of force and tactile sensing technologies as valuable tools in robot-assisted surgery [156]. Various research projects have addressed the integration of the sense of touch in surgical or diagnostic tools [157, 158] and shown the feasibility of using artificial touch for tumour localization [159–162]. However, the integration of tactile sensing in robotic tools for medicine is still an open research topic, requiring several advances prior to clinical application and socio-economic impact. Furthermore, though endoscopic or MIS tools endowed with tactile sensorization have been developed [163], only very recent and preliminary technologies [164], within the framework of the EndoVESPA EU project, integrate an artificial sense of touch in dedicated tools for robotic capsules [165], mainly as a consequence of miniaturization constraints.

4 Capsule endoscopy patents

A large number of patents on capsules for digestive endoscopy have been filed worldwide, underlining the interest in these novel devices and this field of medical application. It is not possible to provide an exhaustive and detailed patent analysis in this manuscript, but the main topics of interest will be highlighted in order to understand, in some cases, the industrial trends.

Inventors and companies are particularly trying to improve the features and techniques of these challenging devices, which require high technology and particular attention to patients’ comfort and physicians’ requests.

For these reasons, over recent years different aspects of this topic have been studied and developed, such as: i) wireless capsule, ii) magnetic guidance, iii) imaging, iv) power source, v) energy management, vi) localization and locomotion mechanism, vii) drug delivery, and viii) biopsy. Different companies, such as Olympus, Co., Given Imaging, Ltd, Siemens AG, and university research groups have invested time and resources to develop new ideas for these devices and achieve interesting technological solutions.

Some of these patents are dedicated to improving the imaging information obtained from cameras located on the capsule. They use thermal imaging cameras [166] or an internal radiation unit that detects radioactive drugs injected into the body [167]. In fact, image processing makes it possible to increase contrast in the gastrointestinal tract for particular pathologies and morphologies. For example, the use of infrared or other frequencies of the electromagnetic spectrum allows details that are not visible in the visible spectrum to be analysed. Physicians can thus improve their diagnosis and better respond to disease evolution. Another area under examination is the optical section, which includes lenses and image sensors (e.g., CMOS, CCD). Recent developments have focused on enhancing the image captured from the capsule by using multiple image sensors for a spherical view [168] or by using image sensors at the front end and the rear end of the capsule [169]. The use of ultrasound and Doppler principles is a challenging topic for engineers and researchers, allowing them to incorporate these technologies within the capsules and generate different kinds of images and details of the examined tract [170].

The majority of patents on capsule endoscopy focus on new methods for performing locomotion and tracking the position of the capsule. More specifically, they describe new techniques that employ magnetic guidance, leading to improved magnetic interaction and capsule motion management [171, 172], or that use an ultrasound positioning system [173]. Power efficiency is another topic under examination, especially for wireless capsules, which need to reduce energy consumption and guarantee all features and functions over the entire duration of the exam (e.g., a self-charging method in which a power source is charged by an external electric field) [174].

Moreover, biopsy mechanisms have been developed to permit the capsule to collect tissue samples and so improve the efficiency and scope of the exam. This is achieved by using electro-mechanical solutions to collect and store the sample inside the capsule [175, 176].

All these new features and ideas are interesting and some of them promise to bring real and tangible aid to the evolution of endoscopy with clear advantages for both patients and clinicians.

5 Conclusion

Although the introduction of WCE in clinical practice at the start of the millennium sent shock waves of change through the field of GI endoscopy, progress over the last few years has slowed significantly relative to research advances and expectations. This is indirectly demonstrated by the fact that the number of new patent applications has been decreasing since 2009 [177]. Since the appearance of the first capsule endoscope, several IT and robotics research groups around the globe have proposed a variety of methods, including algorithms for detecting haemorrhage and lesions, reducing review time, localizing the capsule or lesion, assessing intestinal motility, providing wireless endoscopic capsule control through accurate magnetic models, locomotion and therapy, and enhancing video quality. Even though research is prolific (as measured by publication activity), the industry-oriented technological progress made during the past 5 years can only be considered marginal (with respect to clinical needs and research-oriented outcomes). Nevertheless, WCE has the potential to become the leading screening, diagnostic and therapeutic technique for the entire GI tract [102]. Moreover, a miniaturized robotic device that promises to offer targeted therapy (e.g., used as a smart active carrier for drug delivery) has been a long-term fascination and, why not, an unfulfilled dream of the medical profession and patients alike.

For a device to become the next innovative robotic solution for non-invasive diagnosis and therapy in the research field of WCE, the aforementioned modules (e.g., powering, telemetry, diagnosis and treatment) should be addressed and properly integrated. Several parallels could be drawn from the field of medicine/gastroenterology and other sciences, as well as from bio-mimetic/bio-inspired approaches, such as the spider/insect, worm and fish-like capsules, which are promising not only with respect to navigation within the digestive tract, but also for treatment (e.g., haemostatic clips or drug delivery, biopsies or small dissections) [10].

Chimeric devices that combine the best of both worlds, i.e. conventional and wireless GI endoscopy, seem a promising next step (as in the EndoVESPA EU project [104]). Therefore, following this approach, we believe that GI endoscopy in the third decade of the new millennium will become a success story of screening efficacy and minimal –if any– discomfort. This should be provided by an enhanced version of the capsule-based platform and allied technologies, which should allow, e.g. improved image-based capabilities with assistive algorithms [178], and active locomotion [49]. As aeronautical engineering –for more than a century now– has not significantly moved away from the conceptual design/idea of the aviation pioneers, the external capsule-like shell of the device - with optimizations of materials and shape [10, 179] - will not change drastically over time. Instead, the speed (and accurate control of the device), the functional characteristics (image definition, illumination, 3D reconstruction, tactile sensing and therapeutic embedded tools for targeted therapy and in situ drug delivery) and the indications for obtaining it will change over time. We believe that the era of assistance or – in some extreme cases – automation (in diagnosis and therapy), an era of universal, equitable, high-quality GI endoscopy is finally here. Are we going to stay back?