User management

Users are detected by the camera and are assigned a unique id which is kept constant as long as the user is in view. If the camera loses the user for more than a brief period, the user is likely to be assigned a new id, and will therefore be treated as a new user. For now, Furhat doesn't have a memory of past users.

In the skill, a UserManager keeps track of users and raises events related to their status. The UserManager can be accessed in the flow through the userManager property.

Engaged users and EngagementPolicies

Just because Furhat can see a user, it doesn't mean that the user is interested in talking to Furhat. Therefore, the system keeps track of which users are engaged (i.e., taking part in the interaction). To decide which users are considered to be engaged, the system uses different engagement policies. For complex situations, you might want to build your own EngagementPolicy.

SimpleEngagementPolicy

If nothing else is specified in the skill, the system uses the SimpleEngagementPolicy. This policy says that when users enter an inner interaction space ellipse (by default a circle with a 1 meter radius around the robot), they will be considered to be engaged. To add robustness, a larger interaction space (by default 0.5 meters outside the inner space) is used to trigger users leaving the interaction (becoming disengaged). In the default setting, the number of engaged users is also limited to 2.

To set a custom SimpleEngagementPolicy, you can use the following methods (we recommend doing it in your first state's init block):

val Start : State = state {
    init {
        // Set the inner space to a circle of 2.0 meters (outer distance becomes 2.5) and maximum users to 3
        users.setSimpleEngagementPolicy(2.0, 3)

        // Set the inner and outer spaces to circles of 1.2 and 1.7 meters respectively, with two maximum users
        users.setSimpleEngagementPolicy(1.2, 1.7, 2)

        // Set the inner and outer spaces to a narrow ellipse ahead of the robot, for only one user
        // (note that a SingleUserEngagementPolicy might be a better idea here)
        val policy = SimpleEngagementPolicy(
            userManager = userManager,
            maxUsers = 1,
            innerRadiusX = 0.25,
            outerRadiusX = 1.5,
            innerRadiusZ = 1.4,
            outerRadiusZ = 1.6
        )
        users.setEngagementPolicy(policy)
    }
}

SingleUserEngagementPolicy

For single-user interactions, where only one user is expected to interact with Furhat, standing straight in front of the robot, there is also a SingleUserEngagementPolicy. This policy takes two parameters that constrain which users are considered to be engaged: maxDistance, which is the maximum distance to the user, and maxCenterAngle, which is the maximum angle from the center line (x-axis) to the user. If several users are within these constraints, the closest user is selected. An engaged user is allowed to leave these constraints without becoming disengaged, as long as no other potential user is detected.

// Use default parameters (1.5 meters distance and 15 degrees angle):
users.setEngagementPolicy(SingleUserEngagementPolicy())

// Specify your own parameters:
users.setEngagementPolicy(SingleUserEngagementPolicy(maxDistance = 2.0, maxCenterAngle = 20.0))

Users entering and leaving

When a user becomes engaged (i.e., enters the interaction space), you will get a SenseUserEnter event, and when a user becomes disengaged, a SenseUserLeave event is sent. These can be caught in the flow like this:

val Idle = state {
  onEntry {
      furhat.attendNobody()
  }

  onUserEnter {
      furhat.attend(it)
      goto(Attending)
  }
}

val Attending = state {
  onEntry {
      furhat.say("Hello there")
  }

  onUserLeave {
      furhat.say("Goodbye")
      goto(Idle)
  }
}

As can be seen, the entering user can be accessed through the it variable.

Accessing users

The UserManager keeps track of the current user. This is typically the user Furhat is currently attending. When Furhat detects speech, the default assumption is that it is the current user who is speaking. The current user can be accessed in the flow through the users.current property. The specific getters you can use are:

  • users.current: Gets the current user. This is usually the user Furhat is attending, unless you have instructed Furhat to attend a location or Furhat is currently glancing at something.
  • users.random: Gets a random user from the set of engaged users, including the current user.
  • users.other: Gets a random user from the set of engaged users, which is not the current user.

Each of these properties returns a User object.
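
For example, here is a minimal sketch (the dialogue logic is illustrative and assumes at least one other engaged user) of handing a question over to another user. Since Furhat attends the other user, they also become the new current user:

onResponse {
    // Pick an engaged user other than the current one and hand the question over
    val other = users.other
    furhat.attend(other)
    furhat.say("What do you think?")
}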

User data

The User object can be extended to carry skill-specific data. The most convenient way to do this is to create extension properties in Kotlin, using the UserDataDelegate provided for this purpose. It is recommended to place these extensions in a separate file, such as "user.kt":

var User.score : Int? by UserDataDelegate()     

var User.played : Boolean? by UserDataDelegate()

Note, however, that these fields are not null-safe. If you want to make them null-safe, you should instead use the NullSafeUserDataDelegate:

// Since it is null-safe, we have to provide a default value for the field, using a closure
// The type for the field does not have to be specified here, since Kotlin can derive it automatically from the closure
var User.score by NullSafeUserDataDelegate { 0 }

var User.played by NullSafeUserDataDelegate { false }

This way, you can easily store and access information with strong typing:

users.current.score++

if (users.current.played) {
    // Do something
}

Attention

Attention is an important aspect of a robot interaction, especially if it is multi-party. When an agent (user or system) is attending to (looking at) another agent, this typically means that the attended agent is supposed to respond. Thus, we need to be able to both control the attention of the system agent (the Furhat robot) in an appropriate way, and check the attention of the users.

Attending users

When you attend users, two things happen:

  1. The robot will turn to the user
  2. The UserManager will change the current user to the new user

Note: if you attend a location, the current user will not change.

Use the furhat.attend() method to have the robot attend users. To access specific users, please see User management.

furhat.attend(user)                   // Attend a specific user
furhat.attend(userId)                 // Attend a user based on userId
furhat.attendNobody()                 // Reset attention
furhat.attendAll()                    // Attend all engaged users (switching the gaze between them)

To attend users based on responses or events, use the following examples:

onResponse {
  furhat.attend(it.userId) // Attend the user of an onResponse
}

onUserEnter {
  furhat.attend(it) // Attend the user of an event
}

Attending locations

You can also choose to attend a location as follows:

// Attend a location at x = 1.0, y = 0.0, z = 2.0, i.e. 1m right and 2m in front of the robot
val location = Location(1.0, 0.0, 2.0)
furhat.attend(location)

Note that the coordinate system is placed with the x-axis to the robot's left, the y-axis up, and z-axis to the front of the robot. There are also several built-in locations, such as Location.UP.

Advanced control

For more advanced control of the head's behavior, you can supply optional arguments.

furhat.attend(User/Location, gazeMode = ActionGaze.Mode, speed = ActionGaze.Speed, slack = Int) // Will move slowly and only move the head enough to satisfy the maximum slack parameter.

ActionGaze.Mode is a parameter to set different modes of attending a user (or location):

  • Default: Move to the exact gaze location whenever the slack value is exceeded.
  • DeadZone: This option is useful to minimise unnecessary head movements when alternating attention between two users. Instead of moving all the way to the target gaze location, it will adjust the head as much as it needs to satisfy the defined slack threshold.
  • HeadPose: Neck and eyes are fixed together without regard to the slack.
  • Eyes: Only move the eyes.

Slack is a parameter that allows the robot to keep its head more steady and avoid many small micro-adjustments. The slack value is the allowed offset of the head position from the target gaze location. A low slack value will cause the robot to constantly re-adjust its head position. A higher slack value will cause the robot to maintain its head pose and only adjust its eye gaze. The default value is 0 degrees.

ActionGaze.Speed is a parameter to set the speed of the head movements.

Example usage:

furhat.attend(users.current, gazeMode = ActionGaze.Mode.DEADZONE, slack = 10) // Attend user, but only move the head enough to satisfy the maximum slack parameter.
furhat.attend(Location.DOWN, gazeMode = ActionGaze.Mode.EYES) // Looks down with eyes only - same as the deprecated eyesOnly option
furhat.attend(Location(1.0, -1.0, 1.0), speed = ActionGaze.Speed.XSLOW) // Slooowly face the bottom left (illustrative coordinates) with mode ActionGaze.Mode.DEFAULT (eyes move first, head follows)

There are several optional parameters for manually controlling the head position, instead of using a gazeMode. You can set the head position to "look" at a location, or you can set an exact head rotation.

furhat.attend(User/Location, headTarget = Location(), headRoll = 10.0, speed = ActionGaze.Speed.XFAST) // Eyes look at user/location, head faces target, with roll around z-axis (see location)
furhat.attend(User/Location, headPose = Rotation(), speed = ActionGaze.Speed.MEDIUM) // Head pose is defined as tilt (x-axis), pan (y-axis), roll (z-axis) angles, applied in that order

Example uses:

furhat.attend(Location.UP_LEFT, headTarget = Location.UP, headRoll = 10.0, speed = ActionGaze.Speed.SLOW) // Slowly look up with a thoughtful roll of the head while the eyes travel up to the left. Perhaps you ought to be using a thoughtful gesture instead? 
furhat.attend(users.random, headPose = Rotation(15.0, 0.0, 0.0)) // Attend a user, but with a fixed head pose skewed downwards.

Glancing

In multi-party interactions, you might sometimes want to acknowledge a user that you are currently not attending, without changing your current user. Glancing is a good way to do this.

furhat.glance(users.other) // glance at a random user that isn't the current user

furhat.glance(users.other, 2000) // glance for 2 seconds (2000 milliseconds)

A typical use-case is that you are talking to one person and another user enters the interaction, in which case you might want to glance at the new user for a few seconds before reverting attention to the current user. It's important to remember to set the instant = true option, otherwise the trigger will interrupt any called state. For more info, see Triggers when calling states.

Example below:

onUserEnter(instant = true) {
  furhat.glance(it) // glance at the new user before reverting attention.
}

Checking Furhat's attention

The current user can be accessed with users.current.

Note that the current user might not always be attended, for example if Furhat is glancing somewhere or is attending a location. To check Furhat's current attention, you can use:

// Check if Furhat is attending a specific user
furhat.isAttending(user)

// Check if Furhat is attending some user
furhat.isAttendingUser()

// Check if Furhat is attending all users
furhat.isAttendingAll()

// Check if Furhat is attending a specific location
furhat.isAttending(location)

// Check if Furhat is attending any location (but not a user)
furhat.isAttendingLocation()

Checking users' attention

You can also check the users' attention:

// Check if user is attending Furhat
user.isAttendingFurhat()

// Check if user is attending a specific location
user.isAttending(location)

Note: The users' attention is derived from their estimated head pose. However, head pose is not a perfect indicator of a person's gaze target, and head pose estimation from video is also not very accurate. Therefore, the above methods only give a rough indication.

To react to changes in the user's attention, you can use an onUserAttend trigger, which is triggered every time a user's attention shifts towards or away from Furhat:

onUserAttend(instant = true) { user ->
    if (user.isAttendingFurhat()) {
        println("User ${user.id} is now attending Furhat")
    } else {
        println("User ${user.id} is now attending somewhere else")
    }
}

The user's attention is computed using two thresholds (for hysteresis), one for gaining attention (the gaze angle falls below the threshold) and one for losing attention (the gaze angle goes above the threshold). The gaze angle is the number of degrees between the user's gaze direction and the direction to Furhat. You can change these thresholds if you want (the former should always be smaller than the latter to add robustness):

// Let's decrease the thresholds somewhat
// Default value is 20.0
users.attentionGainedThreshold = 15.0
// Default value is 35.0
users.attentionLostThreshold = 25.0

Identification

When using the ReSpeaker microphone (the external microphone provided with Furhat), Furhat can identify the direction sound comes from. Using this information, Furhat is able to guess the identity of the person who spoke. To identify who spoke, you can use the following example code, which gives the userId of the speaking user in the response:

onResponse {
    val userThatSpoke = it.userId
}
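
As a sketch (the dialogue logic is illustrative), this can be used to switch attention to whoever spoke, if they are not already the current user:

onResponse {
    val userThatSpoke = it.userId
    // Attend the identified speaker if they are not the current user
    if (userThatSpoke != users.current.id) {
        furhat.attend(userThatSpoke)
    }
    furhat.say("Yes?")
}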

Gesture detection

Note: This functionality is not accessible when running on the virtual robot

Recognizing a user's facial gestures is a very powerful tool when developing social interactions for Furhat. Through its camera, Furhat can detect users' facial gestures, and a skill developer can access this data.

Currently Furhat only supports the detection of a user smiling.

Example

This example shows how to create a trigger for a user starting to smile, a global trigger for any gesture starting, and a global trigger for any gesture ending.

//Trigger on a smile
onUserGesture(UserGestures.Smile) {
  furhat.gesture(Gestures.BigSmile)
}

//Trigger on any gesture starting
onUserGesture {
   furhat.say("Wow, that is a really cool gesture")
}

//Any gesture ending.
onUserGestureEnd {
  furhat.gesture(Gestures.ExpressAnger)
}

The following fields are accessible from the triggers:


onUserGesture {
  it.gesture //String representation of the gesture
  it.userID //ID of the user that is showing this gesture
  it.isCurrentUser //Boolean depicting if this is our current user.
  it.conf //Average gesture value that activated the trigger
}

onUserGestureEnd {
  it.gesture //String representation of the gesture
  it.userID //ID of the user that is showing this gesture
  it.isCurrentUser //Boolean depicting if this is our current user.
}

Just like any other triggers, they can also have a condition, be instant, or have a priority.

Parameters

There are three parameters that can be adjusted:

  1. entryThreshold: The threshold for a gesture to become active. The gesture value needs to be above the entry threshold.
  2. exitThreshold: When a gesture is active, the value needs to be below the exit threshold in order for it to become inactive.
  3. filterWindow: The gesture value is an average calculated over a period of time. With a bigger filter window, more frames are considered when calculating the average; with a smaller filter window, fewer frames are considered.

Params can be changed/accessed like this:

//Params with their default values
UserGestures.Params.Smile.entryThreshold = 0.7
UserGestures.Params.Smile.exitThreshold = 0.5
UserGestures.Params.Smile.filterWindow = 5

Defining your own EngagementPolicy

TBA

External Feeds

Since the 1.20.0 update, the system allows the user to access the camera feed from the built-in camera. The audio feed is also available since 2.6.0.

Note: The SDK only supports the audio feed; we don't support computer cameras or other video feeds. On the robot, the feeds are only exposed for robots of type 'Research'.

Enabling/Disabling

First, the feeds have to be enabled. This can be done in two ways: either programmatically inside a skill, or by using the web-interface.

Programmatically, the feeds can be enabled by using the furhat object (available inside a state) like this:

val Example = state {

  onEntry {
    /** Camera feed */
    furhat.cameraFeed.enable()     //Enables the camera feed
    furhat.cameraFeed.disable()    //Disables the camera feed
    furhat.cameraFeed.isOpen()     //Returns a boolean depicting if the feed is open
    furhat.cameraFeed.isClosed()   //Returns a boolean depicting if the feed is closed
    furhat.cameraFeed.port()       //Returns the port of the camera feed as an Int

    /** Audio feed */
    furhat.audioFeed.enable()      //Enables the audio feed
    furhat.audioFeed.disable()     //Disables the audio feed
    furhat.audioFeed.isOpen()      //Returns a boolean depicting if the feed is open
    furhat.audioFeed.port()        //Returns the port of the audio feed as an Int
  }
}

The external feeds can also be enabled/disabled by the click of a button in the web-interface. They can be found under Settings > External Feeds.

Camera usage

When the camera feed is enabled, a url will be provided. The url leads to a ZMQ.SUB socket where a stream of images and metadata is published. Each JPEG image (binary string) is followed by a JSON-formatted object containing annotations of that image. The annotation object contains a timestamp (unix epoch time) and a users array containing information about each of the detected users. For each user, the following information is provided:

{
      "id":1,
      "bbox":{
        "x":..,
        "y":..,
        "w":...,
        "h":...
      },
      "landmarks":[x0,y0,x1,y1,x2,y2...],
      "pos":{
        "x":x,
        "y":y,
        "z":z
      },
      "rot":{
        "pitch":p,
        "yaw":yaw,
        "roll":roll
      },
      "emotion":{
        "hap":x,
        "sad":y,
        "ang":z,
        "sur":w,
        "neu":n
      },
      "faceprint":[...]
}

Additionally, a timestamp is added to the JPEG image itself as EXIF data. To read the metadata out of the image, you can use any library that reads EXIF data from a JPEG image. The metadata is stored in the User Comment field as a JSON object with a single timestamp field in unix epoch time. This can be used to ensure that the image and annotation data belong together.

An example of how to connect a Furhat skill to an external computer for image processing and display of annotated camera images can be found here - see the object recognition tutorial.
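
For a quick stand-alone test of the feed, a minimal subscriber sketch could look like the following. This is not the official tutorial code; it assumes the JeroMQ library (org.zeromq:jeromq), that each JPEG frame is followed by a JSON annotation frame as described above, and a placeholder url (use the url provided when the feed is enabled):

import org.zeromq.SocketType
import org.zeromq.ZContext

fun main() {
    val cameraFeedUrl = "tcp://127.0.0.1:3000" // placeholder; use the url provided when enabling the feed
    ZContext().use { context ->
        val socket = context.createSocket(SocketType.SUB)
        socket.connect(cameraFeedUrl)
        socket.subscribe(ByteArray(0)) // subscribe to all messages
        while (true) {
            val jpegBytes = socket.recv()      // binary JPEG image
            val annotations = socket.recvStr() // JSON annotations belonging to that image
            println("Got image of ${jpegBytes.size} bytes, annotations: $annotations")
        }
    }
}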

Audio usage

When the audio feed is enabled, a url will be provided. The url leads to a ZMQ.SUB socket where an audio stream is published.

The audio is encoded as signed, 16-bit, big-endian linear PCM, 16 kHz, stereo (the same format as WAV files are encoded in). The input audio (microphone) is in the left channel, and the output audio (speech synthesis) is in the right channel.

An example of how to use the Furhat audio feed can be found here - see the audio streaming tutorial.
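
Similarly, a minimal subscriber sketch for the audio feed (again assuming the JeroMQ library and the format described above; not the official tutorial code) could split the interleaved stereo stream into microphone and synthesis channels like this:

import org.zeromq.SocketType
import org.zeromq.ZContext
import java.nio.ByteBuffer
import java.nio.ByteOrder

fun main() {
    val audioFeedUrl = "tcp://127.0.0.1:3001" // placeholder; use the url provided when enabling the feed
    ZContext().use { context ->
        val socket = context.createSocket(SocketType.SUB)
        socket.connect(audioFeedUrl)
        socket.subscribe(ByteArray(0))
        while (true) {
            val chunk = socket.recv()
            // 16-bit big-endian samples, interleaved left (microphone) / right (speech synthesis)
            val buffer = ByteBuffer.wrap(chunk).order(ByteOrder.BIG_ENDIAN)
            val mic = ShortArray(chunk.size / 4)
            val synth = ShortArray(chunk.size / 4)
            for (i in mic.indices) {
                mic[i] = buffer.getShort()
                synth[i] = buffer.getShort()
            }
            println("Received ${mic.size} microphone samples and ${synth.size} synthesis samples")
        }
    }
}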