Personal and Ubiquitous Computing, Volume 22, Issue 4, pp 751–770

SEQUENCE: a remote control technique to select objects by matching their rhythm

  • Alessio Bellino
Original Article


We present SEQUENCE, a novel interaction technique for selecting objects from a distance. Objects display distinct rhythmic patterns by means of animated dots, and users select one of them by matching its pattern with a sequence of taps on a smartphone. The technique exploits the temporal coincidences between the patterns displayed by objects and the tap sequences performed on the smartphone: when a tap sequence matches the pattern displayed by an object, that object is selected. We propose two alternatives for displaying the rhythmic sequences associated with objects: the first uses fixed dots (FD), the second rotating dots (RD). We evaluated both alternatives in two studies. The first study, carried out with five participants, aimed to identify the most appropriate speed for displaying the animated rhythmic patterns. The second study, carried out with 12 participants, measured errors (i.e., activations of unwanted objects), missed activations (within a given time limit), and activation times. Overall, the two design alternatives perform similarly (errors: 2.8% for FD and 3.7% for RD; missed activations: 1.3% for FD and 0.9% for RD; activation time: 3862 ms for FD and 3789 ms for RD).
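The abstract describes selection by temporal coincidence between a displayed rhythmic pattern and a sequence of taps. The paper's actual matching algorithm is not reproduced here; the following is a minimal, tempo-normalized sketch of the idea, in which each candidate object is represented by the beat timestamps it displays and a tap sequence is matched against all of them. All names and the tolerance threshold are illustrative assumptions:

```python
def normalized_intervals(timestamps):
    """Inter-onset intervals scaled by their total duration (tempo-invariant)."""
    intervals = [b - a for a, b in zip(timestamps, timestamps[1:])]
    total = sum(intervals)
    return [i / total for i in intervals]

def match_object(tap_times, patterns, tolerance=0.08):
    """Return the id of the object whose rhythm matches the taps, or None.

    `patterns` maps object ids to the beat timestamps (ms) each object displays.
    The taps are compared to each pattern after tempo normalization, so only
    the relative rhythm matters, not the absolute speed of tapping.
    """
    taps = normalized_intervals(tap_times)
    best_id, best_err = None, tolerance
    for obj_id, beats in patterns.items():
        ref = normalized_intervals(beats)
        if len(ref) != len(taps):
            continue  # different number of beats: cannot match
        err = max(abs(a - b) for a, b in zip(taps, ref))
        if err < best_err:
            best_id, best_err = obj_id, err
    return best_id
```

For example, taps at 100, 600, 1100, and 1350 ms reproduce the long-long-short rhythm of a pattern with beats at 0, 500, 1000, and 1250 ms, regardless of the offset at which the user starts tapping.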


Keywords: Interaction techniques · Rhythm matching · Touch remote control · Touchless remote control



Many thanks to Giorgio De Michelis for his comments and advice on the initial and revised versions of the paper, and to Daniela Bascuñán for helping me improve the English. Thanks also to my daughter Eleonora, who was born during the revision period of this paper and nevertheless allowed me to work on it and submit it before the deadline. Finally, thanks to the friends and colleagues who gave me suggestions and critically discussed the different design alternatives of SEQUENCE.



Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2018

Authors and Affiliations

  1. Università di Milano-Bicocca, Milan, Italy
