These are my notes from UX LX 2016, recently recovered from Evernote and therefore being published outside of their natural sequence. For the rest of the content, open the UXLX16 story page.

@jonesabi

Humans respond to the volume they hear.

Today we are going to cover our process of conversation, how we speak to machines, and how to make machines with voice interfaces stronger.

Rules that we follow every day

When we walk down the street, we don’t make eye contact with other humans.

Making eye contact is a form of recognition and signals that we would like to engage in conversation.

Steps

  1. recognition
  2. greeting
  3. initial enquiry

Gricean maxims

  1. say what is true
  2. be as informative as required
  3. be relevant
  4. be perspicuous

Leaning about Grician Maxims (and how fictional narratives constantly break them) from @jonesabi #uxlx https://www.sas.upenn.edu/~haroldfs/dravling/grice.html

Tweet by stephanierieger

Computers need to wait for us to stop talking in order to process what we said.

The computer then extracts tiny parts of the speech and runs sound recognition on them to understand the natural language.

The dialogue manager keeps track of this process and identifies the different elements of the sentence.

It then generates natural language.
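
The stages above can be sketched as a toy pipeline. This is only an illustration under loud assumptions: speech recognition is faked by treating the "audio" as text, understanding is plain keyword spotting, and every name here (`DialogueState`, `recognise_speech`, `understand`, `generate_reply`, `handle_utterance`) is hypothetical and was not shown in the talk.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DialogueState:
    """What the dialogue manager has understood so far (hypothetical)."""
    intent: Optional[str] = None
    slots: Dict[str, str] = field(default_factory=dict)


def recognise_speech(audio: str) -> str:
    # Stand-in for real speech recognition: the "audio" is already text.
    return audio.lower().strip()


def understand(text: str, state: DialogueState) -> DialogueState:
    # Toy language understanding: keyword spotting, nothing more.
    words = text.rstrip(".?!").split()
    if "weather" in words:
        state.intent = "get_weather"
    if "in" in words and words.index("in") + 1 < len(words):
        # Treat the word right after "in" as the city name.
        state.slots["city"] = words[words.index("in") + 1]
    return state


def generate_reply(state: DialogueState) -> str:
    # Toy language generation from the tracked state.
    if state.intent == "get_weather" and "city" in state.slots:
        return f"Looking up the weather in {state.slots['city'].title()}."
    if state.intent == "get_weather":
        return "Which city do you want the weather for?"
    return "Sorry, could you rephrase that?"


def handle_utterance(audio: str, state: DialogueState) -> str:
    # Runs only once the speaker has finished the whole utterance.
    text = recognise_speech(audio)
    state = understand(text, state)
    return generate_reply(state)


state = DialogueState()
print(handle_utterance("What's the weather?", state))
print(handle_utterance("In Lisbon", state))
```

Because the state object survives between turns, the second utterance only has to supply the missing city; the intent from the first turn is still remembered.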

Neural Networks are being used to improve this process.

Paris - France + Italy = Rome.

Baseball - Bat + Racket = Tennis.
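
A minimal sketch of the word arithmetic above, assuming each word is stored as a vector of numbers. The four-dimensional vectors below are hand-made purely for illustration (real systems learn hundreds of dimensions from large amounts of text), and the names `VECTORS`, `cosine` and `analogy` are hypothetical.

```python
import math

# Hand-made toy vectors: [is a capital, France, Italy, Germany].
# These values are an assumption for illustration, not learned embeddings.
VECTORS = {
    "paris":   [1.0, 1.0, 0.0, 0.0],
    "france":  [0.0, 1.0, 0.0, 0.0],
    "rome":    [1.0, 0.0, 1.0, 0.0],
    "italy":   [0.0, 0.0, 1.0, 0.0],
    "berlin":  [1.0, 0.0, 0.0, 1.0],
    "germany": [0.0, 0.0, 0.0, 1.0],
}


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms


def analogy(a, b, c):
    """Return the word closest to vector(a) - vector(b) + vector(c)."""
    target = [x - y + z for x, y, z in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    # Exclude the three input words so the answer is a new word.
    candidates = [w for w in VECTORS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(target, VECTORS[w]))


print(analogy("paris", "france", "italy"))  # rome
```

Subtracting France removes the "France" component from Paris, leaving "capital"; adding Italy then lands on the vector for Rome.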

There is a hidden layer that humans don’t have access to, and it processes this information to come up with the most probable outcome.

Cohesion

Cohesion is the glue of discourse - Cohen, Giangola, Balogh

My kid only eats rice. He can’t survive on that.

Computers can now understand pronouns.
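
To make the rice example concrete, here is a deliberately naive sketch of pronoun resolution: each pronoun is linked to the most recently mentioned noun of a compatible kind. Real systems rely on trained models rather than a lookup table; the tiny lexicons and the name `resolve_pronouns` are assumptions made up for this illustration.

```python
# Hypothetical mini-lexicons: which kind of thing each word refers to.
NOUN_KINDS = {"kid": "person", "rice": "thing"}
PRONOUN_KINDS = {"he": "person", "she": "person", "it": "thing", "that": "thing"}


def resolve_pronouns(text):
    """Map each known pronoun to the latest earlier noun of the same kind."""
    last_seen = {}  # kind -> most recent noun of that kind
    resolved = {}   # pronoun -> noun it refers to
    for token in text.lower().replace(".", " ").split():
        if token in NOUN_KINDS:
            last_seen[NOUN_KINDS[token]] = token
        elif token in PRONOUN_KINDS:
            kind = PRONOUN_KINDS[token]
            if kind in last_seen:
                resolved[token] = last_seen[kind]
    return resolved


print(resolve_pronouns("My kid only eats rice. He can't survive on that."))
# {'he': 'kid', 'that': 'rice'}
```

The heuristic breaks as soon as two candidates of the same kind appear, which is part of why cohesion is hard for machines.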

Cadence

Rhythm, stress and intonation

For every word spoken to a computer, we need to give different levels of intonation.

Context

Context is an agent’s understanding of the relationships between the elements of the agent’s environment.

— Andrew Hinton

Comprehensive

People will only use voice interfaces when they don’t run into failure modes, such as the computer saying it can’t do something without first asking for more information.

A conversation interface that really works needs to be able to match all these requirements.

recording of ‘how we talk, how machines listen’ 20160526 11:01:31.m4a