See the original story in Japanese
MEMOPATCH recently met with Espen Systad, co-founder and CEO of Capsule.fm. The Berlin-based startup is developing an artificial persona that reads you the news, tells you the weather, or simply plays your favourite tunes, all behind an impressively simple user interface. We talked with him about the future of user experiences and, along with that, the future of Capsule.fm.
- Tell us about Capsule.fm
We started Capsule.fm as a project two years ago. It is simply impossible to touch your smartphone and read something whilst simultaneously exploring audio content. Out of this struggle, the idea was born to create a new, truly native mobile medium. The content we had back then was different, but it included music, news, and social media updates. Eight months ago, when we launched the wake-up app, the morning alarm was its main function. It would inform you of the weather, read you the news, and then play you music. With the recent global launch, we made it to the No. 1 spot in the News category in the App Store in 28 countries. In Norway, our home country, it became the No. 1 app overall. We have made continuous improvements, added more music streaming services, and started to speak more languages. Currently we speak English, Norwegian, and German, and next comes Japanese.
- What was the biggest challenge for development?
That artificial personality is the computer voice talking to you. The biggest challenge was making her pronunciation natural and her spoken content correct. First, we had to create a complex library so that the computer could put together proper sentences. In addition, the text needed a lot of extra treatment to sound natural when spoken by the computer voice. We made it sound better every day. The Norwegian voice was horrible at first, but we have now been able to achieve a high level of quality.
After “Her” came out, investors’ reactions changed
- The concept is very reminiscent of the movie “Her”. What impact did that movie have on you?
Capsule was developed before “Her” came out. However, the film really changed the way investors reacted to our idea, and it actually helped a lot. Some skepticism was replaced by understanding once they had seen it. Investors were able to envision an intimate relationship between human and computer, and they liked it.
- By the way, can the user actually speak back, like in “Her”?
For now, the audio interaction is only one-way. The user listens, but his or her interaction with the app is through touch gestures. I don’t really believe in voice input: people look at you strangely on the train when you talk to yourself.
- What is the vision for the product?
Well, it will be a true audio browser: a platform that connects users with audio content. We have reached about 30% of this vision. We have daily goals that bring us closer, step by step.
Our Japanese users love the simple interface
- You have launched globally. Is there any difference between how users in different countries interact with the product?
It is super interesting to see different trends in different countries. For example, the Japanese seem to love the news, whereas this is the first thing Americans switch off, preferring to hear jokes and entertaining stuff. The number of interactions between the user and the app is also really high among Japanese users. Where the American user taps once, the Japanese user taps 20 times. I believe this also has to do with how Japanese society seems to have less fear of contact with artificial intelligence and robots. There also seems to be more tolerance towards the voice of a computer.
The Japanese version is not out yet, but we have had a very positive reaction from people who practice English with the app. We are also popular for our music content. Streaming services such as SoundCloud are not used as widely, and Spotify cannot be used in Japan. There is a special music culture here, such as Vocaloid, but how we can utilize this specific subculture for the app is still an open question.
I also heard some Japanese users say the user interface is too simple, which makes it hard to use. European app design is often minimalistic, and this can be confusing at first if you are not used to it.
A personal relationship between humans and computers
- By the way, has anyone fallen in love with the voice à la “Her”?
Well, we measure the amount of content consumed in units we call a “Capsule”. There are some users who listen to 1000 Capsules a day.
The English version of the female personality is called Miranda. When we implemented changes, heavy users left us comments like, “This is not Miranda”. At first, we were not sure whether it was better for the artificial person to become very close to the user or to maintain a certain distance. We soon found out that users want to stick with one artificial voice and that it must stay consistent. Miranda, for example, is pretty cheeky and sometimes even bullish.
A good way to make our characters appear more human-like is to incorporate humour. Humour only works when common ground is shared, a sort of shared reality. Our goal is to allow the user to have such a shared reality, or context, with the computer.
Capsule.fm Tokyo Meetup
Capsule co-founders Espen Systad and Tor Langballe are coming to Japan in February. On February 6th, from 19:30, we will hold a “Capsule.fm × Berlin Startup Night” at the Shibuya office of Goodpatch. Join us by registering here.
Espen Systad on Twitter: https://twitter.com/revesjef
Capsule.fm on Twitter: https://twitter.com/CapsuleFm
Capsule.fm on Facebook: https://www.facebook.com/Capsulefm