Leap Motion Gestures for Spotify

Friends. Family. Followers. Floaters. My four F’s.
Welcome to this blog post.

Note: I say cursor since I don’t want to say touchpad/mouse/etc.

Today (it’s always today; today is today and tomorrow is today), I’m going to focus on the third project from my Human-Computer Interaction class, which was to create a conceptual design for Leap Motion as an input device for a desktop or mobile app. We would test our design by Wizard-of-Ozing it in user tests. My partner for this project was Jacob Curley, and we ended up choosing Spotify as our app. Now let’s explore the use of Leap Motion, a hand-tracking sensor that uses two cameras and lots of math, in Spotify. This project showed me the limitations of a physical gesture-based system and why it might not be ideal in some situations.

Image of Leap Motion, taken from https://edgylabs.com/leap-motion-hand-vr. The sensor is the little rectangular object near the laptop.

Now why did we pick Spotify? There were two main use cases we thought of. One was that users may be working on some kind of project where they utilize their full screen. Instead of having to alt-tab to Spotify, which admittedly isn’t too much work, users can just gesture to do whatever action they’d want in Spotify, such as disliking or skipping a song.

The other use case, which was more relevant, was that users may be playing music from Spotify out loud while they’re doing other activities, such as folding laundry or dancing. In these physical activities (versus online activities), the user is more likely to already be in some kind of motion, and by having Leap Motion, the user can just do a quick gesture and get back to whatever they’re doing instead of having to use the keyboard and go through the search for whatever they want to do.

Image of Spotify

Brainstorm:

So for our brainstorming, we first thought of relevant Spotify functions. Our list ended up being play, stop, skip, back (to previous song), like, dislike, loop toggle, shuffle, volume up/down, add song to playlist/library, and switch playlist/station. One thing we did not include was queues in Spotify since neither of us used them. Not surprisingly, our list changed after some user testing and some afterthought.

As previously mentioned, Leap Motion utilizes two cameras. It has no depth sensor, but with two cameras it can calculate how far things are from it. Using only two cameras keeps Leap Motion cheap, but there are downsides. An important one is its inability to detect occluded hands or fingers. For example, if your hands were on top of each other, it can’t detect the hand on top of the other one. Depending on the location and position of your hand relative to the Leap Motion sensor, it might not be able to detect your hands at all. One such case is holding your hand perpendicular right above the sensor: it can’t tell that your fingers are all on top of each other, so it just doesn’t detect a hand at all. With these limitations in mind, Jacob and I thought up some gestures users in Spotify might want to use for our initial user testing. Here’s the list:
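The two-camera setup relies on stereo triangulation: a point seen by both cameras lands at slightly different pixel positions in each image, and that offset (the disparity) determines the point's distance. Here's a minimal sketch of that math; the focal length and baseline numbers are illustrative placeholders, not Leap Motion's real calibration values.

```python
# Sketch of the stereo-triangulation math a two-camera sensor relies on.
# focal_length_px and baseline_mm are made-up illustrative values,
# not Leap Motion's actual calibration.

def depth_from_disparity(focal_length_px: float, baseline_mm: float,
                         disparity_px: float) -> float:
    """Distance of a point from the cameras, from its pixel disparity.

    A feature seen at x_left in one camera and x_right in the other has
    disparity = x_left - x_right; nearer points have larger disparity.
    """
    if disparity_px <= 0:
        # The point wasn't matched in both views (e.g. occluded by a hand).
        raise ValueError("point not visible in both cameras")
    return focal_length_px * baseline_mm / disparity_px

# A feature with a large disparity is close to the sensor...
near = depth_from_disparity(focal_length_px=400, baseline_mm=40, disparity_px=80)
# ...and the same feature with a small disparity is farther away.
far = depth_from_disparity(focal_length_px=400, baseline_mm=40, disparity_px=20)
assert near < far
```

This is also why occlusion breaks detection: a fingertip hidden behind another finger never appears in both images, so there is no disparity to triangulate from.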



  • stop – stop hand sign, 5 fingers out (maps to people’s idea of stop)
  • play – closed hand, index and middle fingers out (looks like the play sign)
  • skip/back a song – full hand swipe
    • see what people do (left vs. right; people’s model of moving things vs. moving through things)
  • like/dislike songs – thumbs up/down
  • loop playlist/song – taps to toggle (kind of maps to how you have to do it in the app; you have to click on it to switch the current mode)
  • shuffle – full hand swipe up (like throwing papers up and having them get mixed up)
  • volume up/down – circles (clockwise for volume up, counterclockwise for volume down; maps to turning a volume dial)
  • add to playlist/library – palm-up beckon (maps to telling someone/something to come towards you)
    • a menu pops up to add the song to a playlist or the library
    • two-finger swiping for scrolling through the list (like scrolling through a list on a touch screen, except with two fingers, since two fingers might be easier to detect and won’t get confused with the one-finger taps of the loop toggle)
  • switch playlist/station – fist


 

We tried to keep our gestures intuitive but also distinct from each other, so that the Leap Motion sensor could distinguish between different gestures. Making our gestures intuitive would help users remember and learn them. A prime example of an intuitive gesture is the one for stopping the song. Having all five fingers out looks very much like a stop sign, so it would make sense to most users that it indicates stop. The idea behind making our gestures distinct from each other was that it would be bad if a user wanted to skip a song but ended up shuffling instead. We also decided that users would be able to use either the left or right hand for actions, so that users aren’t limited to just one hand for gesturing in case they are carrying something in the other, such as laundry.
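Conceptually, the whole system boils down to a mapping from recognized gestures to Spotify actions. Here's a minimal sketch of that dispatch idea; all the gesture labels and action names are hypothetical placeholders (a real build would receive labels from a recognizer and call Spotify's API or send media-key events instead of returning strings).

```python
# Sketch of routing a recognizer's gesture labels to Spotify actions.
# Every name here is a made-up placeholder, not a real Leap Motion or
# Spotify identifier.

GESTURE_ACTIONS = {
    "open_palm": "stop",           # stop sign: 5 fingers out
    "two_fingers": "play",         # play sign: index + middle out
    "swipe": "skip_or_back",       # direction decides skip vs. back
    "thumb_up": "like",
    "thumb_down": "dislike",
    "tap": "toggle_loop",
    "swipe_up": "shuffle",
    "circle": "volume",            # rotation direction decides up vs. down
    "palm_up_beckon": "add_to_library",
    "fist": "switch_playlist",
}

def dispatch(gesture: str) -> str:
    """Map a recognized gesture to an action, ignoring unknown gestures."""
    return GESTURE_ACTIONS.get(gesture, "ignore")
```

Keeping the gestures distinct, as described above, is what makes the left-hand side of this table reliable: the closer two gestures look to the sensor, the more often the recognizer will emit the wrong key.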

Initial User Testing:

The first user test was done with Betsy, a fellow student from my HCI class. My professor told me that we should use people outside of our class, but I didn’t do that (sorry, professor!). I was thinking of asking random people at the library to participate in user tests, but uh, I didn’t (I also did not do as many user tests as I would have liked). For the user tests, I “taught” the users how to use the app and then had them do different actions, Wizard-of-Ozing through my phone, which was linked to my laptop’s Spotify. The screen recording isn’t included in the video since most of the actions can be heard. Here’s a video of the RAW user test, including the bloops at the beginning (try finding me in the reflection!):

Some things to note about this user test:

  • The user swiped towards the right in order to skip the song
  • The user mixed up liking the song and adding the song to the playlist
  • The user used one finger to scroll through the playlist option
  • We did not implement a way for users to select what playlist they wanted to add the song to once they were in that interface
  • User thought we had all the relevant functions
    • Queues aren’t that relevant for using Spotify, which is good since we didn’t consider it
  • The wizard gave back audio feedback as to what actions were taken

We decided to keep the skip/back gesture up to the user’s interpretation for our future tests, since it depends on the user’s model of moving things versus moving themselves. As for liking versus adding the song to the playlist/library, Betsy mentioned that we could make the gesture the same as like, since liking a song automatically adds it to a user’s library. For her, liking a song was the same as adding it to her library. We considered this for a bit, but then realized (with our next user test and some additional consideration) that users might want to add a song without actually liking it. For example, a user might like a wide variety of music, such as Broadway music and rock, but be tired of hearing Broadway music at the time. The user may stumble upon a Broadway song that they want to add to their library so they can have access to it in the future, but they don’t want to like it, since they are tired of hearing Broadway music. We don’t want to force our users into liking songs in order to add them to their library (which is kind of a dark pattern). This example is based on a true story : )

Betsy used one finger to scroll through the playlist options once she was in the menu for choosing where the song should be added. This may have been because we use one finger to scroll through things on a touchscreen. Although we could have changed our scroll to be one-fingered instead of two, we thought it might have been hard for the sensor to detect the difference between this and loop toggling (however, we could also disable the ability to loop toggle while inside this menu).

We decided that there should be some kind of feedback, either audio or visual, so that users can know what actions have taken place, especially for toggling the loop function. It’s also good to have this feedback so that users know their gestures have been registered. Audio feedback makes the most sense, since depending on their task users might not be close enough to their screens to see visual feedback, though audio feedback might be a little disruptive to the music playing.

Leap Motion Gesture Check:

Since our gestures were mostly set, we went to the Leap Motion sensor and tested our gestures against it to see whether it would register them or not. Our original idea was to have the sensor propped on the laptop screen where a webcam might normally be, but that did not work out very well with our gestures. For example, it couldn’t detect our add to playlist/library gesture because it was a gesture where the hand was mostly perpendicular to and level with the sensor. As a result, it couldn’t detect the hand. The sensor also detected an invisible hand/arm quite a few times while propped there. Rogue ghost loose!

We decided to have the sensor on a flat surface instead, such as a table or just above the laptop keyboard. It seems the sensor was made for this kind of sensing, so it was better able to detect our gestures at this angle. The Leap Motion sensor was made more for VR, where the sensor is mounted on the headset at eye level and the user puts their hands out in front of it. This means it’s not really meant for detecting gestures from a second-person point of view (where the sensor is propped on top of the laptop screen) but for a first-person point of view (where the sensor is on a flat surface).

However, there were still some things that it could not detect while on a flat surface. A short list of changes:



  • fists (for switching playlists/stations) – the Leap Motion couldn’t see the fingers hidden in the fist, so it did not detect the gesture correctly. We changed it to a wave/shaking hand, since that seems to map to change(s). It’s like mixing stuff together.
  • thumbs up/down – similar problem as fists; changed to an index finger up or down, since more fingers are exposed that way
  • play sign – the Leap Motion wasn’t able to detect the play sign that well, so we changed it to a peace/bunny sign instead, which was basically the same thing except with more fingers exposed



The Leap Motion best detected our gestures when they were done at a distance, since some of our gestures had fingers on top of each other. So the Leap Motion would be placed above the keyboard when users wanted to use it at the laptop, and it could be moved somewhere else if the user wanted more range while away from the laptop.

New gestures:

 

More testing!

Now onto a user test with my friend Alexis with the improved gestures! She took more freedom in choosing what gestures/actions to take. (Song we were referencing in the test: My skinny flacca by Huecco Lobbo. I found it while listening to Spanish songs on Spotify)

Findings:

  • Wizard-of-Ozing with the phone to control the laptop’s likes/dislikes does not work, since likes/dislikes are somehow separate for each device.
  • This user test is kind of realistic in that users could be hanging out with their friends and then decide to do an action on Spotify while still being focused on the conversation. Not as much concentration is needed compared to controlling a cursor and searching for the buttons on the laptop. Someone could do an experiment to see how quickly people can do the gestures versus using the touchpad/mouse to click on the buttons.
  • User swiped for skip towards the left.
  • The volume up/down gesture was a little too much work
  • The user wanted a mute function
  • Could not remember new play gesture (not intuitive) (we changed it before the test while she was “learning” the gestures)
  • No way to remove songs from library (mentioned before testing)
  • Before the test, the user told us that adding songs to the playlist was not that necessary for using Spotify

Lots of changes!

Although Alexis wanted a mute function, we decided it would be a bit too much to implement: too many gestures, and is it necessary? A closing fist might make sense for muting, but the user could just pause the song and achieve the desired effect of silence. She mentioned that we could use a finger moving up/down for volume up/down instead, except in our gesture-based system we already had a finger pointing up/down for like/dislike, which the sensor might confuse with a finger moving up/down, especially if it’s moving slowly. I do agree that making the circle motion for the volume requires a bit of work, though.
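Distinguishing the clockwise (volume up) circle from the counterclockwise (volume down) one is a neat little geometry problem: the sign of the signed area of the traced loop (the shoelace formula) tells you the rotation direction. This is my own sketch of the idea, not the Leap SDK's built-in circle gesture (which also reports rotation direction); the fingertip coordinates here are hypothetical.

```python
# Classify a circle gesture's rotation direction from a trace of 2-D
# fingertip positions, via the sign of the shoelace (signed-area) sum.
# This is an illustrative sketch, not Leap Motion's actual recognizer.

def circle_direction(points):
    """Return 'clockwise' or 'counterclockwise' for a closed 2-D trace.

    With y pointing up, a positive signed area means the trace runs
    counterclockwise.
    """
    area2 = 0.0
    # Walk consecutive point pairs, wrapping around to close the loop.
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        area2 += x1 * y2 - x2 * y1
    return "counterclockwise" if area2 > 0 else "clockwise"

# Quarter-points of a counterclockwise circle -> volume down in our scheme.
trace = [(1, 0), (0, 1), (-1, 0), (0, -1)]
assert circle_direction(trace) == "counterclockwise"
# The same trace in reverse runs clockwise -> volume up.
assert circle_direction(list(reversed(trace))) == "clockwise"
```

A real recognizer would also need to decide when a trace counts as a circle at all (enough curvature, loop roughly closed) before asking which way it turns.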

She also could not remember what the play song gesture was and immediately did the stop gesture. I think Alexis thought of play/stop as more of a toggle instead of two different options, since they were opposites of each other, or maybe it was like a button you push with your full hand. We decided to make it so that either the peace/bunny sign or the stop sign would work for playing the song (the stop sign would stop the song if it was playing and play it if it was stopped). This way play/stop would also work like a toggle, and users could fall back to the stop sign to play the song if they forgot the bunny/peace sign (since that gesture isn’t intuitive). We couldn’t think of an intuitive gesture for play that the Leap Motion could detect.
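The resulting behavior is a tiny state machine: the stop sign toggles playback, while the peace/bunny sign only ever plays. A minimal sketch (gesture labels are placeholders, and a real version would drive Spotify rather than a boolean):

```python
# Sketch of the final play/stop behavior described above.
# "open_palm" and "peace_sign" are hypothetical recognizer labels.

class Playback:
    def __init__(self):
        self.playing = True

    def on_gesture(self, gesture: str) -> None:
        if gesture == "open_palm":      # stop sign acts as a toggle
            self.playing = not self.playing
        elif gesture == "peace_sign":   # dedicated play gesture: always plays
            self.playing = True

p = Playback()
p.on_gesture("open_palm")   # playing -> stopped
assert not p.playing
p.on_gesture("open_palm")   # stop sign again -> playing (the fallback)
assert p.playing
```

Making the stop sign a toggle means a user who forgets the play gesture can never get stuck: the one gesture they do remember recovers both states.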

Alexis also swiped towards the left to skip a song, and she mentioned that in the Spotify phone app, that’s how you swipe to go through different songs. As a result, we made skipping a right-to-left swipe and going back a left-to-right swipe, so that it follows the gestures on the phone. In theory, users would be able to change this, similar to how they can invert scrolling directions.
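Classifying the swipe direction could be as simple as looking at the net horizontal displacement of the palm over the gesture. A sketch under that assumption, with a made-up distance threshold (not a Leap SDK value):

```python
# Sketch of classifying skip vs. back from a short trace of palm
# x-positions over time. Per our final mapping: right-to-left = skip
# (matching the phone app), left-to-right = back. The 80 mm threshold
# is an arbitrary illustrative value.

def classify_swipe(xs, threshold_mm: float = 80.0) -> str:
    """Return 'skip', 'back', or 'none' from palm x-positions over time."""
    if len(xs) < 2:
        return "none"
    dx = xs[-1] - xs[0]        # net horizontal movement of the palm
    if dx <= -threshold_mm:    # hand moved right-to-left
        return "skip"
    if dx >= threshold_mm:     # hand moved left-to-right
        return "back"
    return "none"              # too small to count as a swipe

assert classify_swipe([120, 60, 0, -40]) == "skip"
assert classify_swipe([-40, 20, 90]) == "back"
assert classify_swipe([0, 10]) == "none"
```

The "invert directions" preference mentioned above would then just swap the two return values, the same way OSes invert scroll direction.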

Another thing she mentioned was that we couldn’t remove songs from the library; we could only add them. This reminded us of dark patterns, in that if you somehow add a song by accident, you can’t remove it and it’ll be in your library forever (until you go to your laptop/PC and manually remove it). We decided to add a gesture to remove songs from the library that was basically the opposite of adding one. Instead of a closing palm up (beckon), it would be a closing palm down (like pushing something away). This gesture makes sense since adding and removing songs are opposites, and the gesture reflects that opposition.

Another change we made from this test was the removal of gestures for adding songs to a playlist. It was too complicated, since it required a pop-up interface that the user needed to view and select from. The whole interaction of going through the list of playlists was a bit clunky, since users could only move through one playlist at a time. Besides that, for our use cases, users would mostly be using the app to listen to songs, not necessarily to create playlists. Removing it lessened the mental requirements for using our gesture set with Spotify (fewer gestures to remember).

We also removed the ability to switch between playlists/stations for similar pop-up UI reasons. We had not considered the fact that the user would have to select or search for the playlist/station they want to switch to. It would simply be easier for the user to go to their laptop and look for the station they want. Considering that the user would not be doing this often in our use cases (I don’t think users change playlists/stations often in general either), we discarded this function. We moved the wave gesture we were using for this function to the shuffle function. Although an upward hand swipe kind of maps to mixing things up, a wave/shaking hand maps to it more, since it’s like shaking things together (like holding a jar and shaking it). It maps better and also requires less arm movement, making it easier to do.

Our final gesture system was more simplified compared to our original idea and did not require adding any large UI changes (like a pop-up menu). Considering our user tests, it seemed like most of these gestures were memorable.

Final gestures:

 

Here’s a video of our final gestures in action!

A Final Mirror (cause it reflects – haha? : ( not funny)

Initially, we tried to convert the more functional aspects of Spotify into our gesture system. However, the use of some of these gestures, mainly add to playlist and switch playlist, required additional UI elements. While these UI elements wouldn’t necessarily be too hard to implement, using them with our gestures did not exactly work out. For example, instead of using two-finger swipes in the air to scroll through the list of playlists one by one, a user could just use their cursor and go straight to the playlist they want. The addition of a gesture system here isn’t needed, and it would be inferior to using the cursor in terms of both speed and effort (the cursor involves a lot less fatigue).

Additionally, in our use case where the user isn’t at their laptop, the additional UI elements kind of take away from the point of using our gesture-based system. The user would be required to stay at their laptop until the whole selection is done. But even more, would the user even use this function in that use case? Probably not. It’s more common for users to just like/dislike songs and add them to their library. Adding songs to a playlist would most likely happen in a sit-down session where a user is trying to create a new playlist out of songs they already have in their library, and the user wouldn’t necessarily be switching playlists/stations that often. All of these factors together led us to get rid of the add to playlist and switch playlist/station functions.

The removal of these functions from our gesture-based system shows how physical gesture-based systems cannot fully replace the keyboard and cursor (especially considering the limitations of hand-tracking sensors like the Leap Motion). Accuracy aside, gesture-based systems require a lot of movement. Our end product limited the amount of horizontal and vertical movement, which would decrease fatigue. And considering that users wouldn’t be gesturing to Spotify for long periods of time, the fatigue from our gesture-based system for Spotify would be a lot less than the fatigue from a gesture-based system for, say, navigating the web.

Considering the accuracy of the sensor, though, it limits the choice of gestures greatly, to the point where non- or less-intuitive gestures have to be used. For example, our like/dislike was initially a thumbs up/down, which is very intuitive, but we had to change it to an index finger up/down since the Leap Motion sensor couldn’t recognize the thumbs that well. Requiring non- or less-intuitive gestures makes it harder for users to learn which gestures activate which functions.

Overall, when creating a gesture based system for an app, one (you, me, us, the world) has to consider whether these gestures actually complement the existing functions of the app. These gestures should not replace what is already there, but instead add additional functionality or possible usages (in our case, allow users to quickly do actions in Spotify without having to go through their desktop/find their cursor). It was fun being a wizard for a week.
