Companion Robot

Kebbi Robot

Pet, Educator, Playmate, Smart Housekeeper, Storyteller

intro

Kebbi is a companion robot. It brings happiness to people with its lively character, hilarious motion design, and smart AI functions.

The robot gets hungry and falls asleep easily. It snores and dances to music. It’s also a smart housekeeper and a STEAM educator that teaches one-on-one English conversation lessons.

context

It is not easy to find companionship when one needs it.
Parents don’t have enough time for their children. Friends don’t get to meet until holidays.

design goal

Before I joined Kebbi’s UX team, a few prototypes had been built and tested. The robot was named ‘Mibo’, with a harmless look and a curvy shape. It was designed to appeal to users by being:

  • Smart: equipped with sensors and AI functions.
  • Fun: plays games and acts with likable behaviors.
  • Unique: built around a killer character setting.
  • Emotionally valuable: comforting and healing.

affective sound design

Mibo was silent when I joined the team. We started building an affective system for the robot.

My teammates were passionate about voice-acting the robot’s emotions. I layered and processed the recordings, matching their frequency range to the robot’s TTS voice.

The robot was designed to switch emotions while playing different characters.
I dug into its story settings, sketching each character’s potential personality and the related sound elements.

We wanted the robot to act like a human, in a cute way. That’s why it would fall asleep when the user’s not around. It would snore like a dude:

Users might wake the robot when they feel like talking to it.

text-to-speech system problem

When we designed the robot’s voice and script for the OOBE (out-of-box experience), we found that it was speaking at an odd pace. The time gaps between words were too short, so sentences ran together: ‘Hellotheregoodtoseeyou!!!!!’

We manually added spaces, commas, tildes, question marks, and exclamation marks to express the proper tones and emotions.
We also used homophones to make it sound more natural.
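For illustration, here is a minimal Python sketch of this kind of script pre-processing. The names and rules are hypothetical, not our production pipeline:

```python
# Sketch of TTS script pre-processing: explicit punctuation and spacing
# force the engine to pause and inflect instead of rushing words together.
PAUSE = ", "          # a short breath between phrases
EXCITED = "!!"        # exclamation marks push the pitch contour up

def prepare_script(phrases, excited=False):
    """Join phrases with explicit pauses so the TTS engine doesn't rush."""
    ending = EXCITED if excited else "."
    return PAUSE.join(phrases) + ending

print(prepare_script(["Hello there", "good to see you"], excited=True))
# -> "Hello there, good to see you!!"
```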

first sound sketch

Assuming the robot was from an abandoned planet,
it might be pretty good at machine recycling, like WALL-E.

I added machine sounds to Mibo, but something went wrong. It sounded heavy, busy, and unhappy. I realized it was already walking with servo motors before I ever added any SFX on top.

I used light, dominant sounds instead. The frequency range was set to 800-10,000 Hz, the band most human ears are sensitive to. The lower frequency band was reserved for the robot’s own body sounds.
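As a sketch of the idea (not our production DSP chain), here is how a band-pass filter could keep SFX inside that range, assuming 44.1 kHz audio:

```python
# Sketch: keep SFX inside the 800-10,000 Hz band so it doesn't collide
# with the servo-motor noise living in the lower band.
import numpy as np
from scipy.signal import butter, sosfilt

SAMPLE_RATE = 44_100

def bandpass_sfx(audio: np.ndarray, low_hz=800, high_hz=10_000) -> np.ndarray:
    sos = butter(4, [low_hz, high_hz], btype="bandpass",
                 fs=SAMPLE_RATE, output="sos")
    return sosfilt(sos, audio)

# one second of white noise as a stand-in for an SFX clip
sfx = np.random.randn(SAMPLE_RATE)
filtered = bandpass_sfx(sfx)
```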

WALL-E is from the past, but Mibo is from the future. We wanted it to carry a digital, futuristic, delightful, and playful image, so I added a bit of a tech feel to its sounds.

The robot would speak at different pitches and speeds when playing characters and expressing emotions. We set default and floating values for its TTS system, and planned to integrate sound engines with randomized calculations to bring more variety.
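A minimal sketch of the default-plus-floating-value idea; the character names and ranges here are illustrative, not our tuned values:

```python
# Per-character TTS parameters: a default value plus a "floating" range,
# sampled for each utterance so the voice never sounds exactly the same.
import random

CHARACTERS = {
    "mibo":   {"pitch": 1.2, "pitch_float": 0.10, "rate": 1.0, "rate_float": 0.05},
    "sleepy": {"pitch": 0.9, "pitch_float": 0.05, "rate": 0.8, "rate_float": 0.05},
}

def tts_params(character: str) -> dict:
    c = CHARACTERS[character]
    return {
        "pitch": c["pitch"] + random.uniform(-c["pitch_float"], c["pitch_float"]),
        "rate":  c["rate"]  + random.uniform(-c["rate_float"],  c["rate_float"]),
    }

print(tts_params("mibo"))   # e.g. {'pitch': 1.23, 'rate': 0.98}
```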

Can robots sound like humans?
I looked for sound plugins and tried adding a mechanical feel. Somehow, the robot felt more connected when it sounded more human.

what we hear first

If all sounds come in at the same power, the result is a mess.
Our brains decide what is more important. Based on my observation, people tend to perceive meaning from the words and emotion from the tone of a voice.

mixing robot sounds

Therefore, we put the robot’s voice and emotions front and center, while other sound details enriched the overall audio experience.

It’s like mixing a song. The robot’s VO was set to a moderate, clear level. Emotion sounds kept a dynamic range of -3 to -12 dB, so users could adjust their sense of audio distance from the robot.
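For reference, those dB figures translate to linear gain like this (a small sketch; treating the VO as the 0 dB reference is my assumption):

```python
# The VO sits at a fixed reference level; emotion sounds are scaled
# somewhere in the -3 to -12 dB window relative to it.
def db_to_gain(db: float) -> float:
    """Convert decibels to a linear amplitude multiplier."""
    return 10 ** (db / 20)

VO_GAIN = db_to_gain(0.0)         # reference level
EMOTION_LOUD = db_to_gain(-3.0)   # emotion sound at its loudest (~0.71)
EMOTION_SOFT = db_to_gain(-12.0)  # emotion sound at its softest (~0.25)

print(round(EMOTION_LOUD, 2), round(EMOTION_SOFT, 2))
```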

System sounds were classified into three groups by importance. We didn’t want our users to be interrupted by the ‘Next Step’ button, but they might want to know when the robot is out of power.

Minor sounds were kept simple.
Users would hear more of them after a few seconds of silence.
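A sketch of the three-tier priority idea; the tier names and timing are illustrative:

```python
# Critical sounds always play; minor sounds wait for a few seconds of
# silence so they never interrupt anything more important.
import time

CRITICAL, NORMAL, MINOR = 0, 1, 2
SILENCE_BEFORE_MINOR = 3.0        # seconds

class SoundScheduler:
    def __init__(self):
        self.last_played = 0.0

    def should_play(self, priority: int) -> bool:
        quiet_for = time.monotonic() - self.last_played
        if priority == MINOR and quiet_for < SILENCE_BEFORE_MINOR:
            return False          # don't interrupt with minor cues
        return True

    def play(self, priority: int, name: str):
        if self.should_play(priority):
            print(f"playing {name}")
            self.last_played = time.monotonic()

s = SoundScheduler()
s.play(CRITICAL, "low_battery")   # always audible
s.play(MINOR, "next_step_click")  # suppressed: something just played
```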

hardware limitation

Sound data below 400 Hz was weak due to speaker limitations. To improve the sound texture, we brought back some of the frequencies between 300 Hz and 500 Hz.
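One way to approximate that kind of boost is to add a filtered copy of the band back in parallel; a sketch with an illustrative gain, not our actual EQ curve:

```python
# The speaker rolled off below ~400 Hz, so we thicken the 300-500 Hz band
# by mixing a band-passed copy back into the signal.
import numpy as np
from scipy.signal import butter, sosfilt

SAMPLE_RATE = 44_100

def boost_low_mids(audio: np.ndarray, gain_db: float = 6.0) -> np.ndarray:
    sos = butter(2, [300, 500], btype="bandpass", fs=SAMPLE_RATE, output="sos")
    band = sosfilt(sos, audio)
    return audio + (10 ** (gain_db / 20) - 1) * band   # parallel boost

voice = np.random.randn(SAMPLE_RATE)   # stand-in for robot audio
thicker = boost_low_mids(voice)
```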

pain point #1. OOBE steps

The first pain point: users could not finish the OOBE steps.

Mibo was released with hundreds of bedtime stories. It turned out our OOBE steps were not so easy to complete: there were exceptions left over from the previous use flow, and finishing took a big chunk of time, which is hard for busy parents.

The OOBE flow was revised and simplified. In case the flow was interrupted and users forgot the earlier instructions, we added a search bar, a feature navigation system, and tutorial videos so users could resume the process whenever they asked for it.

In the newer version, users touch the robot’s body to wake it up, and the robot, who is set to come from outer space, introduces itself and walks users through the OOBE steps.
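A sketch of the resumable-flow idea: persist progress after every step so an interrupted setup continues where it stopped. The step names and state file are hypothetical:

```python
# Resumable OOBE: progress is saved after each step, so a user who walks
# away mid-setup can pick up from the same step later.
import json, os

STEPS = ["wake_by_touch", "self_introduction", "wifi_setup", "account_link"]
STATE_FILE = "oobe_progress.json"

def load_progress() -> int:
    if os.path.exists(STATE_FILE):
        with open(STATE_FILE) as f:
            return json.load(f)["next_step"]
    return 0

def run_oobe():
    for i in range(load_progress(), len(STEPS)):
        print(f"running step: {STEPS[i]}")
        with open(STATE_FILE, "w") as f:
            json.dump({"next_step": i + 1}, f)   # resume point if interrupted

run_oobe()
```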

pain point #2. TTS quality

The second pain point: users complained about the TTS voice quality.

The sound waves clipped when we increased the robot’s overall volume level. We added a compressor and a limiter to the sound data and redefined the robot’s audio levels.
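A sketch of the compressor-plus-limiter chain on normalized samples; the threshold, ratio, and ceiling are illustrative, not our tuned values:

```python
# Compressor: gain above a threshold is reduced by a ratio.
# Limiter: a hard ceiling stops the wave from clipping.
import numpy as np

def compress(x: np.ndarray, threshold=0.5, ratio=4.0) -> np.ndarray:
    over = np.abs(x) > threshold
    out = x.copy()
    out[over] = np.sign(x[over]) * (threshold + (np.abs(x[over]) - threshold) / ratio)
    return out

def limit(x: np.ndarray, ceiling=0.95) -> np.ndarray:
    return np.clip(x, -ceiling, ceiling)

loud = np.sin(np.linspace(0, 60, 44_100)) * 1.4   # deliberately too hot
safe = limit(compress(loud))
print(float(np.max(np.abs(loud))), float(np.max(np.abs(safe))))
```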

robot’s theme song

To introduce the robot to the market, we made a theme song.

We pictured the scene of Mibo’s home:
Twisting cloud towers, cars with big eyes, candy rivers, and fat mountains: a wonderland for the robot, fluffy and floating in the sky.

We analyzed the lyric structures of the Chibi Maruko Chan and Doraemon theme songs and found they both have rhythmic choruses. We designed our own catchy, repetitive lyric: ‘Don Don Ju Lu Lu’.

The music genre was set to fantasy adventure.
The song is upbeat and cute, with bits of electronic instruments.

We pictured kids flying with the robot toward a wonderful future.

We invited actors from children’s musicals and amateur singers to sing the song. We let kids vote for their favorite voice.

And people started hearing this song a lot.

Kebbi Robot Performing Theme Song

pain point #3. wake word

As the robot went on sale in China, we encountered the third pain point:
The robot wouldn’t respond to its wake word when users called for it.

According to users in China, the robot had trouble hearing women and kids (voices with higher frequencies and lower sound pressure).

There was no easy solution to this problem. Our users in China came from different regions, so we recorded sound samples with various accents and sent the files to our voice recognition team, strengthening the sound data to improve the voice functions. We also taught users to touch the robot’s belly, which would wake it up immediately.

music waking system

To solve the voice-wake problem, we brainstormed a music waking system; it could be another intuitive communication option.

Darwin argued that mammals share the same emotional expression system.

Trees rustling, a breath of wind, the impact of raindrops.
A light-vented bulbul nesting in my tree, jumping up and down, calling loudly.

We don’t understand animals’ language, but we feel their emotions.

Despite differences in physical vocal structure, mammals produce mid-high, light, harmonic tones to express happiness. They usually sound low, weak, and discordant when they feel down.

We can hear both: what makes an individual animal different, and the ups and downs of its emotions.

For humans, a short melody is interpreted as a sentence that carries a message.
Musical elements (pitch, volume, intonation, rhythm, chord, harmony) will, as a whole, change the user’s mood and perception.

In our scenario, users looked for the robot, and the robot turned to users.

User: Call ( ‘Where are you?’ ‘Hello?’, inquire/greet )
Robot: Response ( ‘Yes, I am coming!’, positive/present/active )

We ideated sets of melodies based on this emotion flow (a tone-rendering sketch follows the list):

1. Call: Do-Mi-Fa-La (a jump forward)
Answer: Sol-Mi-Do (a stable C chord)

2. Call: Mi-Do-Mi (simple and harmonic)
Answer: Mi-Do-Do (rhythmic)

3. Call: Do-Fa-Do-Mi (friendly and hopeful)
Answer: Si-Sol-Do (positive)

4. Call: Do-Fa-Mi (inspired by animal sounds)
Answer: Do-Sol (the closing Do-Sol is a power chord)

5. Call: Mi-La (inspired by a short bird chirp)
Answer: Do-Do-La-La-Mi (an A minor chord, full of energy)
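To audition melodies like these quickly, they can be rendered as simple sine tones. A minimal Python sketch, assuming fixed-do solfège in the fourth octave; the note length is illustrative:

```python
# Solfège names mapped to frequencies (fixed-do, C major, 4th octave),
# so the call/response melodies above can be rendered as sine tones.
import numpy as np

NOTE_HZ = {"Do": 261.63, "Re": 293.66, "Mi": 329.63, "Fa": 349.23,
           "Sol": 392.00, "La": 440.00, "Si": 493.88}
SAMPLE_RATE = 44_100

def render(melody, note_sec=0.25):
    t = np.linspace(0, note_sec, int(SAMPLE_RATE * note_sec), endpoint=False)
    return np.concatenate([np.sin(2 * np.pi * NOTE_HZ[n] * t) for n in melody])

call = render(["Do", "Mi", "Fa", "La"])   # melody set 1: the call
answer = render(["Sol", "Mi", "Do"])      # melody set 1: the answer
```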

To implement this idea, our next step was to evolve the robot’s voice structure. The challenge was to make the robot sing the melody, or to find a melodic word for it to say.

affective computing and emotion model


The robot was officially named ‘Kebbi’. We planned to upgrade the robot’s language system and emotion engine.

We defined the new Kebbi as an emotional robot.

For the robot to get along with users of different characters, I researched the Enneagram of Personality and found that certain personalities go well with one another. A group of personality traits was selected for users to customize Kebbi’s default character.

Based on the default character, Kebbi would be born with a set of actions that map to its preset emotions, stored in its default emotion model.

Every day, Kebbi would detect emotions from the weather or from users.
It would learn and remember the user’s preferences.
Interactions from the user would feed emotion inputs to Kebbi, and Kebbi would ‘unlock’ new sets of emotions to respond with.


The engine’s components (sketched in code after this list):

– Emotion inputs: emotion detection and input of emotion values.
– Emotion management hub: links Kebbi’s brain and emotions.
– Learning and memory center: capable of reacting to situations that happened before.
– Emotion model: a 2D model that stores current and hidden emotions.
– Mood meter: a temporary emotion container that only lasts a few days.
– Expression system: maps expressions to certain emotions.
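To make the structure concrete, here is a sketch of these components as Python data classes; the field names are my own, and the real engine’s internals were more elaborate:

```python
# Sketch of the emotion engine's components as data structures.
from dataclasses import dataclass, field

@dataclass
class EmotionModel:
    current: dict = field(default_factory=dict)   # visible emotion values
    hidden: dict = field(default_factory=dict)    # unlockable emotions

@dataclass
class MoodMeter:
    values: dict = field(default_factory=dict)    # temporary, fades in days

@dataclass
class EmotionEngine:
    model: EmotionModel = field(default_factory=EmotionModel)
    mood: MoodMeter = field(default_factory=MoodMeter)
    memory: list = field(default_factory=list)    # learned situations

    def input_emotion(self, name: str, value: float):
        """Emotion input: route a detected emotion into the mood meter."""
        self.mood.values[name] = self.mood.values.get(name, 0.0) + value
        self.memory.append((name, value))

engine = EmotionEngine()
engine.input_emotion("happy", 0.4)   # e.g. the user petted the robot
```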


Over time, Kebbi would grow to be unique to each individual user.

prototype: lazy dog

Default personality: lazy dog
Emotion model: happiness (single axis)

Basic emotions: idle, curious, fear, angry, relax, happy
Advanced emotions: bored, excited, hope, ecstasy
Hidden emotions: spoiled, love, disappointment

At first, the dog would be born lazy and proud, with a series of behaviors.
It would remain idle most of the time, wagging its tail, sniffing absent-mindedly.

When emotional input events happen, it would change emotions and trigger actions.
It would get scared and whine when going to the doctor.
It would remember the taste of meat, hoping and begging for it.
If it got both the meat and an extra-long walk, the dog would unlock the new ecstasy emotion.

And if the user happened to pet the dog’s belly, it would unlock the love emotion.

All emotions were designed to fade over time.

After a few hours, the dog would be back in idle mode.

Lazy Dog’s Emotion Engine Structure
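A sketch of the lazy dog’s engine in miniature: emotion values decay each hour, and one trigger combination unlocks an advanced emotion. The numbers and trigger names are illustrative:

```python
# Lazy-dog prototype: emotions fade over time; a hidden emotion unlocks
# when its trigger combination happens.
class LazyDog:
    DECAY = 0.9                      # per-hour decay factor

    def __init__(self):
        self.emotions = {"happy": 0.0, "fear": 0.0, "ecstasy": 0.0}
        self.unlocked = {"idle", "curious", "fear", "angry", "relax", "happy"}

    def event(self, name: str, value: float):
        if name in self.emotions:
            self.emotions[name] += value
        # getting the meat AND a long walk unlocks the advanced emotion
        if name == "long_walk" and self.emotions["happy"] > 0.8:
            self.unlocked.add("ecstasy")

    def tick_hour(self):
        """Emotions fade; eventually the dog drifts back to idle."""
        for k in self.emotions:
            self.emotions[k] *= self.DECAY

dog = LazyDog()
dog.event("happy", 0.9)              # got the meat
dog.event("long_walk", 1.0)          # ...and the walk: unlocks ecstasy
dog.tick_hour()
```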

STEAM educator robot

While we planned out the potential AI emotion mechanism, we heard demand for a STEAM educator.

Our users were hungry for content. Kebbi did a great job as an English conversation teacher. Our robot, as we wished, has been providing emotional value through games and education.

future improvement

Kebbi survived COVID-19.
During the pandemic, it wore a mask and checked temperatures for patients.

Kebbi still evolves today, and remains the user’s cute little friend.

