Sunday, 16 June 2013

Verbatim Speech to Text Reporting, VSTTR (sometimes STTR).

Why am I writing a post about Verbatim Speech to Text?

My lipspeaking post was really helpful for me to write and seemed to generate positive responses from people. I want to raise awareness of the variety of communication support options available to help deaf people (and others where appropriate) access spoken information, as I think many of these options are not widely known about.

I am using the word deaf to mean anyone with a hearing impairment and/or any kind of difficulty accessing spoken speech sounds, e.g. auditory processing disorder.

What is Verbatim Speech to Text Reporting?

Verbatim Speech to Text Reporting (VSTTR), also known as "STTR", "STT" and "palantypy" (especially amongst UK deaf folk), derives from court reporting and uses the same technologies. I'm going to use VSTTR throughout this post to emphasise the verbatimness of this communication support option.

A highly trained operator uses a customised chorded keyboard input system to output realtime text of what is spoken. A UK VSTTR operator should be able to output 180wpm at an accuracy of at least 98%. Many are additionally qualified up to 200 or 250wpm. However, each word they output has to be in their dictionaries, so most mistakes will be on custom or unusual words.

Palantype and Stenography are different input systems which produce similar/equivalent output.

What terms are used in North America?

CART (Communication Access Real-Time Translation) is the most common term I've seen used in the USA and Canada.

Others include open captioning and realtime stenography.

What does VSTTR look like?

When used for large conferences with more than one anticipated user, VSTTR output should be on a large screen that is visible to everyone in the room. Anyone who attended BiCon 2010 plenaries would have seen VSTTR in action. The London 2012 Olympics also had some VSTTR (scroll to about 4/5 of the way down the page).

For use by individuals, or those wanting more discretion, VSTTR output can also be transmitted to individual devices like laptops, tablets and smartphones.

I can't find any UK VSTTR videos, and the only US ones I can find aren't themselves subtitled, which is offensively annoying.

  • CART company demo video 2min47s
  • Example of someone having CART for a meeting 4min28s.

There are also other slightly different ways of presenting VSTTR output, such as overlaying it onto a video screen/slides or similar, showing only 3 or so lines at a time, which some people prefer. This tends to be known as "captions", but not always, so do be careful to check what the output options are.

    What is VSTTR useful for?

    VSTTR can be used for most situations where a person or group of people is speaking. Lectures and presentations are relatively simple to manage, even with questions from the floor. Group discussions and seminars can also be managed, but they need excellent sound, careful chairing with one person speaking at a time, and speakers introducing themselves each time before speaking so the VSTTR operator can label people individually.

    VSTTR isn't so ideal for times when a user needs to move around a lot, say standing-around small-group chatting sessions where a lipspeaker or sign language interpreter could walk around with someone, but I have seen VSTTR operators be creative about this too where needed.

    Like lipspeaking, VSTTR requires a decent level of English for the user to be able to follow it properly. It isn't a replacement for sign language interpreting as British Sign Language (BSL) is a different language. Some sign language users who have good English skills may find VSTTR useful so suitability will be down to individuals.

    Ultimately the choice to use VSTTR will depend on a number of factors: suitability for the deaf users, other communication support requested/provided, availability of VSTTR operators, and things like available funding.

    There is some interesting work being done on realtime text access like VSTTR benefiting people who have English as a second language, as it helps them understand spoken information, ensures they haven't misheard what was said, and thus consolidates their vocabulary. Much the same as subtitles on TV or DVDs.

    Professional bodies for VSTTR

    The professional body which represents VSTTR operators for deaf people is the Association of Verbatim Speech to Text Reporters (AVSTTR).

    VSTTR operators train for approximately 5 years, usually becoming court reporters first and then doing extra deaf awareness and technology courses, before becoming registered with the NRCPD (National Register of Communication Professionals working with Deaf and deafblind people) in the UK or the NCRA (National Court Reporters Association) in the USA.

    You can request booking information via AVSTTR's website at no additional cost: complete the webform and it will be sent out to members. VSTTR operators also have their own websites, which you can find linked from the main site with photos, and you can contact them directly based on their listed region.

    You can also search for VSTTR operators on the National Registers of Communication Professionals (NRCPD) website. Tip: you have to provide a location before it'll let you select professional type in the next field.

    Other agencies I have used are Sign Solutions in Birmingham, BID in Birmingham and the RNID (now Action on Hearing Loss), but they will charge extra fees on top compared to going directly to AVSTTR or to individuals.

    The VSTTR operators on AVSTTR all set their own fees which are about £200 per half day or £300 per full day plus travel/accommodation/subsistence if needed - they may need to charge extra if they travel a long way.

    Remote VSTTR

    In the last few years remote VSTTR has become available. This requires:

    1. Good sound,
    2. Good Internet and
    3. Reliable technology.

    Basically, the sound of the speakers is transmitted via the Internet, using Skype or similar, to the remote VSTTR operator, who outputs the text; the user can access this via various secure or custom webpage environments. It can be shown on a large screen or a smaller device.

    121 Captions was the first UK remote VSTTR provider, run by Tina Lannin who is herself deaf. 121 also specialise in a number of overseas services and in 16 different languages.

    There is also Bee Communications run by Beth Abbott who I have worked with over the last year to pilot remote captioning for students at my workplace.

    Both 121 Captions and Bee Communications use VSTTR operators from all over the world and have slightly different focuses but having seen work by both I am impressed by the quality and standard.

    Remote costs seem to be about £70 per full hour and some providers will break down the cost into shorter chunks after an initial hour or two.

    Remote VSTTR has advantages in that you can book for shorter periods of time without paying travel and other costs for the operator, so you're not restricted to half or full days and can get costs down. There is also added privacy/discretion for the deaf user: they don't have a visible "support worker" in the room with them, and they can use a laptop, iPad/tablet or even smartphone to access the content. Remote VSTTR is also bookable at shorter notice than in-person VSTTR, which is especially useful when the user is not always able to get their schedule in advance.

    However, there are disadvantages as well. The Internet *needs* to be good, preferably wired, and reliable. WiFi is often not good enough, or randomly goes unreliable and flaky (contention on the network etc). MiFi/4G dongles or devices can sometimes work, but again are limited by reception. The chance of technical issues in a new place with unfamiliar users is a lot higher than with in-person VSTTR.

    In general I will still aim for in-person VSTTR for whole-day events and places where I'm not familiar with the tech setup. Remote VSTTR is, however, very useful for students, as it allows flexibility and discretion, and if we're able to provide the students with MiFi dongles as well as university WiFi it seems to be usable for most students most of the time.

    How many VSTTR operators?

    In the UK we only have about 25 qualified VSTTR operators listed on the AVSTTR website. However, there seem to be more in the US and Australia, and timezone differences can increase availability, so remote services open up a wider pool of available operators.

    The AVSTTR operators work closely together to share vocabulary dictionaries, and if you're using remote services and not booking too late, it may be possible to find operators with relevant specialist knowledge of the subject.

    For complex work, sessions longer than a few hours in total, or where 5-10 minute breaks every hour or so are not possible, it is recommended that two VSTTR operators are booked. This is about personal health and safety for the operators: their work is cognitively very demanding and they need breaks to rest their hands. You may need to think about structuring your event to allow for regular breaks, which benefit a number of people and allow for concentration realities anyway.

    My views on VSTTR

    I *like* VSTTR, I think that's obvious from this entire post. For me it's ideal cos not only is it text, which I do consider to be my first and preferred language, but if it's full-screen output then there's a whole screen's worth of text. It's like scrollback on tap! It solves the issue of access to the spoken content AND note-making, as I have a transcript for use later if I want.

    With a speaker I can hear reasonably well, I use VSTTR as a visual supplement: I listen to and lipread the speaker while glancing at the VSTTR output to ensure I've understood what is said and to consolidate it in memory. My reading speed is *fast*. Access to VSTTR increases my stamina for spoken content drastically and means I have a good memory of what has been said and what is going on. Where a speaker is harder to hear (e.g. quietly spoken, male, or with an accent) I rely more on the VSTTR and don't have to struggle to lipread or hear.

    How I found VSTTR

    I was never offered VSTTR at university (it was uncommon in the UK in the early noughties, although I knew of people who had it in the US as CART), and I suspect it would have been difficult to arrange for classes. I also wasn't offered VSTTR in the workplace despite having 3 AtW assessments (once in 2007 and twice in 2008/9) with assessors who knew BSL was not my first or preferred language. I wonder if my ability to understand and converse with my two deaf assessors in BSL (well, more SSE speech and SSEish sign at the same time) meant they believed my sign was more durable and reliable than it is. I didn't use my AtW BSL budget allocated in 2008 because it didn't quite meet the need I had, and it was a lot of effort to arrange and not possible at short notice. I am still often told that my BSL is better than I think it is - certainly my reception of spoken information interpreted into sign is.

    Over the last 6 years in full-time employment I have gained a better idea of what my difficulties with hearing and processing speech are. I know my endurance is about 60 minutes without a break, and 3 hours in total, before I get exhausted and dizzy. I also have a poor memory for things I have heard, and struggle to remember the sense and meaning, as well as the detail, of things delivered via conferences or presentations. In contrast, my visual memory for things in text is superb, which is why I like notes and being able to read as much in text as possible.

    I am hoping to get Access to Work (AtW) funded VSTTR for work conferences, meetings with people I don't know, and webinars soon.

    Remote VSTTR and Webinars

    Webinars, which are becoming more common in my field of Assistive Technology, are extremely difficult for me to access, as they tend to involve one or two speakers on a poor video link which I can't lipread from, plus a web chat which I can access. I find after only a few minutes of trying to follow speakers that I get headaches and feel unwell from the effort of understanding audio alone. Remote VSTTR will be ideal, as the operator can log into the webinar and get good audio, and I can follow the typed output on one of my work screens (I have two).

    Questions and comments are welcomed

    If you have any questions, chuck em in the comments. Please tell me if I use words which you don't know, so I can go back and expand/define them if needed.

    Friday, 26 April 2013


    Why am I writing a post about lipspeaking
    I asked various real-life and Internet friends if they knew what lipspeaking was and how it worked, and more than 75% had no idea or were guessing. People said they would like a blogpost.

     I have to start out by saying I am not an expert on lipspeaking. I have known what lipspeakers are since 1995 but never used one myself or been around anyone using one. I am sharing information and knowledge that I have acquired from being on various online deaf communities. This post contains some of my contextualised opinions as well as information.

     I am using the word deaf to mean anyone with a hearing impairment and/or any kind of difficulty accessing spoken speech sounds, e.g. auditory processing disorder.

    What is a lipspeaker
    A lipspeaker is a person who has been specially trained to repeat a speaker's speech in a way that is easier to lipread. They usually do this silently, but can use their voice on request.

    Lipspeakers may add fingerspelling, sign language, body language and other cues to their communication and will adapt their communication according to what the deaf client requests and needs.

    Where the original speaker is speaking very fast a lipspeaker may have to rephrase what is said and only repeat the more essential and salient points. A lipspeaker is supposed to be less than a sentence behind the original speaker.

     There are different levels of lipspeaking qualification. Level 3 seems to be the level required for professional lipspeaking.

    North American terms
    North Americans often use the term "speech reading" for what we UKers call lipreading.

    I now wonder what lipspeakers are called in America...? Googling didn't initially help so I asked humans on the Internet and got an answer in under 5 minutes.  Lipspeakers in America are called "oral transliterators".

    What does lipspeaking look like
    As of April 2013 there is only one 2min35s subtitled video on the brand new ALS website. The video is of a spoken voice which the filmed lipspeaker is lipspeaking for. This video is a deliberately staged and slightly exaggerated example of lipspeaking which is well designed to demonstrate the principles and concepts of lipspeaking.

    I am hoping (and have asked) that the ALS can upload some different short clips showing lipspeaking in the wild and using different modifications like sign language and language modification.

    Lipspeaking in situational contexts
    Mileage is going to vary hugely in how useful people find lipspeaking, depending on many factors such as:
  • When they became deaf e.g at birth or before they developed language, early childhood, adulthood or in old age.
  • Rapidity of deafness whether sudden or gradual.
  • Level of deafness.
  • Type of deafness.
  • Amplification choices - none, hearing aids, CIs, others.
  • Communication choices - speech, residual hearing, sign (BSL/SSE/SEE), cueing and more.
  • Education - level, quality, type, awareness.
  • Individual personality and preference and situation.

    For oral deaf (people who communicate primarily with speech and use residual hearing, possibly with hearing aids and cochlear implants) and deafened (those who have been hearing and become deaf) people, lipspeaking enables them to access the tone, cadence, meaning and body language of the original speaker while accessing the clarity provided by the lipspeaker. Many people could not access this with text-based communication options.

    A lipspeaker (like a sign language interpreter) can move around with a deaf person in situations like conferences and places where small groups of people are clustered around talking.

    A lipspeaker terping for a group may well be much, much easier to follow, as it's a single person, rather than the deaf person having to look around to find the new speaker and switch focus to them and their lip patterns. It takes deaf people longer to realise a new speaker has started talking and "lock in" to the new speaker's speech patterns.

    To lipread English (or any other spoken language) effectively requires the deaf person to have a suitable vocabulary for understanding what has been said. Basically, someone has to have the language and vocabulary in the first place to make use of this, so it may not be suitable for a BSL user who does not have good English.

    Lipreading classes
    I am a huge fan of lipreading classes for people who are deaf, especially those who become deafened as I believe they teach a lot of very useful skills for coping with being deaf in the real world.

     Action on Hearing Loss's Lipreading page:

    I must sign up for some of those classes myself at some point.

    Association of Lipspeakers
    The professional body which represents lipspeakers is the Association of Lipspeakers (ALS), who have a newly designed website.

    This website is pretty comprehensive and worth a read in their own words.

    You can also search for lipspeakers on the National Registers of Communication Professionals (NRCPD) website. Tip: you have to provide a location before it'll let you select professional type in the next field.

    Many lipspeakers also have their own websites.

    Lipspeaking seems to cost approx £30-£40 an hour and may be charged in minimum increments of 2 hours, or by the half/whole day.  Two lipspeakers may be required for long or complex assignments.

    My thoughts on lipspeaking for me
    I think lipspeaking still requires the deaf person to concentrate a lot.

    Lipreading and using residual hearing is tiring, and only about 35% of the English language is even possible to lipread from normal lip patterns alone. I do not yet know if this figure is higher for lipspeakers (using modifications or otherwise).

    Last time I was tested (artificial conditions, previous hearing aids), using single words and careful sentences spoken by a male audiologist:
  • Without lipreading my comprehension was ~60%
  • With lipreading my comprehension was 90% 

    I found the lag on the ALS video somewhat disconcerting, and am not sure that I would gain much from a lipspeaker that I don't get from lipreading most speakers for myself. In fact, watching the video felt much like my few experiences of sharing someone's BSL terp, where it was useful and gave me some extra info but was extremely tiring to take advantage of, as my BSL is fairly poor level 2 standard.

    I think I use a lot of my energy processing audio, even with lipreading or sign language supplementation. This means my memory for audio information is poor. At work I have to make notes or I'll not remember what my students have said properly. 

    I have an excellent (as in frightening people by what details I recall) memory for information I have seen presented in text. 

    While I haven't experienced a true test of lipspeaking I don't think it is for me. However I think more people should know about it and evaluate whether it's something they would find useful and share this information with deaf and hearing people. 

    If you have any questions, chuck em in the comments. Please tell me if I use words which you don't know, so I can go back and expand/define them if needed. 

    I'll probably edit this post as I go along.