Linked by Eugenia Loli on Wed 9th Aug 2006 05:12 UTC
Windows "Surprise surprise. Windows Vista speech recognition actually works. Contrary to what MSNBC criticized as a 'wreck', the speech recognition technology is well developed and highly usable," says Long Zheng.
Wow
by ValiantSoul on Wed 9th Aug 2006 05:57 UTC
ValiantSoul
Member since:
2005-07-20

I typically don't say this about Microsoft products, but this is pretty cool. I like the ideas about when multiple options are available and the screen coordinate system.

Anyone know if OS X has something similar? (Never looked into it - just curious)

Reply Score: 2

RE: Wow
by REM2000 on Wed 9th Aug 2006 08:23 UTC in reply to "Wow"
REM2000 Member since:
2006-07-25

Yeah, the Mac has the same functionality in Tiger; not sure about previous versions.

Good demo; glad to see Vista is starting to pull together. I hope Microsoft releases another public beta soon (I know they said they will). I also hope it's a lot better than Beta 2, as I wasn't impressed.

Reply Score: 2

RE[2]: Wow
by Alleister on Wed 9th Aug 2006 10:28 UTC in reply to "RE: Wow"
Alleister Member since:
2006-05-29

The catch with Tiger's speech recognition is that it is only available in English. Of course that doesn't matter if you are a native English speaker, but I'm happy that I'll get German speech recognition in Vista, which is something Apple has no plans for. There will also be a Chinese version. I don't know, though, what other languages are planned.

I have not tried it myself, since I won't install beta software on my work box, but a friend of mine tried it and claims that the German speech recognition works quite well.

Reply Score: 3

My own experiences..
by The Lone OSer on Wed 9th Aug 2006 06:24 UTC
The Lone OSer
Member since:
2005-07-11

Alas, I find Speech Recognition a rather annoying technology. I first started with IBM ViaVoice, which was built in to OS/2 Warp, then ViaVoice on Windows, and now Vista. My experiences, alas, are exactly the same, even 10 years on with the technology.
I stem from a little country known as New Zealand.. Population - 98 million (94 million of them being sheep), and I find that, although we speak the Queen's English (apparently), the technology fails miserably with our accent, and you can get a hilarious response with the dictation systems... With Vista, when trying to train it, the software would lock solid on every single attempt.
I wait with anticipation of a build of Vista I can actually train - and then we will see if this 10 year old problem persists ;)

Reply Score: 5

RE: My own experiences..
by Alleister on Wed 9th Aug 2006 10:05 UTC in reply to "My own experiences.."
Alleister Member since:
2006-05-29

Sorry, but when I see a film from New Zealand even my biological speech recognition fails miserably. There would have to be a specialized version for New Zealand.

Reply Score: 3

RE: My own experiences..
by RawMustard on Wed 9th Aug 2006 11:31 UTC in reply to "My own experiences.."
RawMustard Member since:
2005-10-10

Perhaps when Kiwis learn to pronounce English correctly, speech recognition software may have a slight chance at working properly. Being married to one, I'm constantly asking her to repeat what she just said. I'm afraid that Fushun chups or suxdi sux don't count as proper English ;)

Edited 2006-08-09 11:38

Reply Score: 5

RE[2]: My own experiences..
by iangibson on Wed 9th Aug 2006 15:33 UTC in reply to "RE: My own experiences.."
iangibson Member since:
2005-09-25

So what is the 'correct' pronunciation of English? Let me suggest that it is certainly not the way Americans speak.

Not that there is a 'way' - there are many regional accents across the USA just as there are in Britain and elsewhere across the English-speaking world.

For speech recognition software to be remotely useful, it surely must be able to accommodate all of the many accents that are out there.

Reply Score: 1

RE[3]: My own experiences..
by Alleister on Wed 9th Aug 2006 15:43 UTC in reply to "RE[2]: My own experiences.."
Alleister Member since:
2006-05-29

Well, that is only possible to a certain degree. On the other hand, you can't expect a speech recognition system to understand spoken language better than a human.

I don't know how bad the dialects are inside America, but there are German dialects that even I, as a German, do not understand.
So people who want to be understood by their software will have to try to speak with as little dialect as possible, because a computer won't do better work than a human on this.

Reply Score: 1

RE[2]: My own experiences..
by martinus on Wed 9th Aug 2006 11:39 UTC in reply to "My own experiences.."
martinus Member since:
2005-07-06

"open the bloody start menu, mate!"

Reply Score: 4

dear aunt
by postmodern on Wed 9th Aug 2006 06:33 UTC
postmodern
Member since:
2006-01-27

Well it DID crash as we all saw in the video clip from MSNBC, and it was certainly a train wreck (as indicated by the audience's chuckling). Glad to see they got the bugs worked out now. Although, Mr. Zheng never tried to write a letter to his aunt...


let's set so double select killer delete all!

Reply Score: 1

Just one word
by audun on Wed 9th Aug 2006 06:37 UTC
audun
Member since:
2005-07-13

Impressive

Reply Score: 1

Reminds me of something
by Lambda on Wed 9th Aug 2006 06:46 UTC
Lambda
Member since:
2006-07-28

Well I could only get the audio from my QT plugin, but listening to it reminded me of Blade Runner, when he's sitting at home, studying the pictures he got from Leon's apartment, and giving the picture viewer voice commands.


The future is here;)!

Reply Score: 2

The video worked the second time around
by Lambda on Wed 9th Aug 2006 06:58 UTC
Lambda
Member since:
2006-07-28

I'm impressed. I haven't really been following the progress of speech recognition, but if the video is any indication, this could be the "killer app" for Vista.

Reply Score: 1

sbenitezb Member since:
2005-07-22

"this could be the "killer app" for Vista."

I doubt it. Speech recognition is a pain in the ass. It's mostly for impaired people, not for everyone's day-to-day use.

We are really far from being able to tell the computer:

* Computer, read my mail; discard all spam.
* Computer, reply to Joe Sixpack: Dear Joe, this voice interface like Star Trek's ship computers is awesome. You should check it out.
* Computer, send the composed mail.
* Computer, preview the picture of my last birthday, the one where I'm with my girlfriend, the new one. Also send a copy to her and print one in high resolution. Make sure Mom gets a copy.

*That*, would be nice.

Reply Score: 4

Lambda Member since:
2006-07-28

We are really far from being able to tell the computer:

* Computer, read my mail; discard all spam.
* Computer, reply to Joe Sixpack: Dear Joe, this voice interface like Star Trek's ship computers is awesome. You should check it out.
* Computer, send the composed mail.
* Computer, preview the picture of my last birthday, the one where I'm with my girlfriend, the new one. Also send a copy to her and print one in high resolution. Make sure Mom gets a copy.



Did you watch the screencast? It looks like 1-3 of your list is completely doable in the Vista system as of now. #4 is more of an AI/Image recognition problem.

Reply Score: 3

sbenitezb Member since:
2005-07-22

"Did you watch the screencast? It looks like 1-3 of your list is completely doable in the Vista system as of now. #4 is more of an AI/Image recognition problem."

Yes. You must be kidding. You cannot use natural language normally to speak to your PC without repeating or making sure it understands exactly what you mean. It's a waste of time. It's just a toy.

Reply Score: 0

Lambda Member since:
2006-07-28

It's a waste of time. It's just a toy.

Or maybe you wish it was a waste of time and just a toy.

Reply Score: 1

sbenitezb Member since:
2005-07-22

"Or maybe you wish it was a waste of time and just a toy."

If you read my previous post you will find the answer.

Reply Score: 1

Lambda Member since:
2006-07-28

"Or maybe you wish it was a waste of time and just a toy."

If you read my previous post you will find the answer.


Your "answer" doesn't coincide with reality and I doubt you even watched the video. Don't worry, Linux will get it eventually and you can then proclaim it's the greatest thing since sliced bread.

Reply Score: 2

sbenitezb Member since:
2005-07-22

"Your "answer" doesn't coincide with reality and I doubt you even watched the video. Don't worry, Linux will get it eventually and you can then proclaim it's the greatest thing since sliced bread."

You like cheap talk, don't you? I watched the video. Even if it wasn't made up, it's not that interesting a technology for what it can actually do. And as I'm not interested in useless things (at least useless to me), I really don't care if Linux ever has it. It works for me now, without crap.

Reply Score: 2

47ronin Member since:
2006-04-03

As long as the mail application is scriptable, almost all of these items have been doable as speech recognition commands for years in Mac OS 9 (we're talking at least seven years). I was able to log in using a vocal phrase key, open and close applications, scroll windows and select items, get information and run menu items. It was a combination of the built-in AppleScript and Speakable Items features of Mac OS back in the day.

One thing I am curious about... does Vista require you to voice-train your computer BEFORE the system can recognize you properly or does it work out of the box with any Joe's accent and drawl?

Reply Score: 1

Linux...
by marcos89 on Wed 9th Aug 2006 07:07 UTC
marcos89
Member since:
2006-05-08

http://perlbox.org/Download.html

Any future with this?? Or maybe another solution??

Reply Score: 0

I don't believe
by ActiveMan on Wed 9th Aug 2006 08:06 UTC
ActiveMan
Member since:
2006-01-15

This demo was prepared in a special environment without any noise, wasn't it?

You can see what happens in the real world here:

http://www.youtube.com/watch?v=fV1kqthZf2g&NR

Reply Score: 4

RE: I don't believe
by Alleister on Wed 9th Aug 2006 10:16 UTC in reply to "I don't believe"
Alleister Member since:
2006-05-29

Yes, because in the real world you always sit with a couple of thousand guests in your home cathedral.

Don't get me wrong, this surely isn't anything spectacularly new, but it reflects my experiences with other speech recognition software. Usually it was enough to use a different microphone from the one you had trained it with, and it stopped working.

You can be sure that no MS employee would present that software if it had not worked when they tried it. You know, you can think of Microsoft what you want, but there is no policy that requires MS employees to have Down syndrome.

Reply Score: 5

RE[2]: I don't believe
by apoc on Wed 9th Aug 2006 12:24 UTC in reply to "I don't believe"
apoc Member since:
2006-03-24

lol, ActiveMan, if you don't believe try it for yourself, it's the best thing to do.

I had already tried Vista's Speech Recognition before that sad "dear aunt etc etc" video, so I know there was a problem at the demo. If you watch it carefully, you'll notice the volume is at the max, and the guy giving the commands didn't know how to use speech recognition. When a recognition error occurs, the speech button turns orange and you have to wait for it to turn blue again. If you keep talking while it is orange, you'll only make things worse, because Speech Recog will think you're still finishing the command when you're actually trying to correct the recognition error.

Try it for yourselves.

I also don't agree that speech recognition is only for impaired people. It's much faster to say "start AppName" than to type the name of the app in the start menu; you don't even have to open it! Also, simple things like clicking buttons are much faster: imagine you're composing an email, you change your mind, and you just say "close that, don't save".

minimize
maximize
switch applications
open apps/common places(documents|pics|videos|computer|other start menu buttons)
start a search("start search")
shutdown/hibernate/sleep/log off
menubar commands("tools options")

these are all tasks that i believe speech recognition will help to speed up.

btw, the new Explorer is also pretty nice in terms of usability: the new address bar, organization options, visual feedback of folder contents, preview pane with multimedia support, metadata below, integrated search. It's nice.

Reply Score: 3

RE: I don't believe
by Janus on Wed 9th Aug 2006 13:10 UTC in reply to "I don't believe"
Janus Member since:
2005-07-20

And you can see the whole clip here:

http://www.youtube.com/watch?v=kX8oYoYy2Gc

Unlike in the clip you posted, he successfully does application launching and switching between windows. Also, after the part where the voice recognition fouls up, he removes the text and starts anew, dictating a letter without any problems. But of course, these parts don't help when you're out to bash Microsoft. ;-)

Anyhow, as described in some detail in the YouTube description of that clip, it was a known bug and it has been resolved.

If you want to know exactly what caused it, you can read all about it here:

http://blogs.msdn.com/larryosterman/archive/2006/07/31/684327.aspx

Reply Score: 5

Impressive!
by Babi Asu on Wed 9th Aug 2006 09:19 UTC
Babi Asu
Member since:
2006-02-11

I hope Apple brings that to Leopard. I don't care if Apple gets called a copycat; that's Apple's problem. ;)

Reply Score: 1

it really looks cool...
by RandomGuy on Wed 9th Aug 2006 09:25 UTC
RandomGuy
Member since:
2006-07-30

but does anybody know how it handles different languages?
There are times when I need to write German (my friends would give me very confused looks if I wrote all my mails in English) as well as times when I need to write English. Is there a simple command like "switch language to X"?

I think it's really good for common tasks, although I would not want to program this way.

Reply Score: 2

RE: it really looks cool...
by n4cer on Wed 9th Aug 2006 18:18 UTC in reply to "it really looks cool..."
n4cer Member since:
2005-07-06

but does anybody know how it handles different languages?

It currently supports 8 languages/dialects:
U.S. English, U.K. English, traditional Chinese, simplified Chinese, Japanese, German, French and Spanish.

Reply Score: 1

challenged
by netpython on Wed 9th Aug 2006 09:48 UTC
netpython
Member since:
2005-07-06

I think voice recognition is a major step forward for physically challenged computer users. Really impressive stuff. Like a helper dog that turns off the light switch. A direct increase in quality of life.

But with all respect, for the majority of us it's just bling bling at the current stage. It starts to become interesting when voice recognition reaches the sophistication of the computers in the sci-fi series Star Trek.

Then again, however nice, it's still a far, far away MS show, and nobody knows when, how and *if* it will be delivered.

Edited 2006-08-09 09:49

Reply Score: 1

And This Means What?
by segedunum on Wed 9th Aug 2006 10:42 UTC
segedunum
Member since:
2005-07-06

Am I going to believe a blog entry or an on stage demonstration that went belly up and where the recognition could recognise the most basic of words - with no background noise?

Reply Score: 1

RE: And This Means What?
by segedunum on Wed 9th Aug 2006 11:42 UTC in reply to "And This Means What?"
segedunum Member since:
2005-07-06

where the recognition could recognise the most basic of words

Whoops. That should of course read 'where the recognition couldn't recognise the most basic of words'.

Reply Score: 1

And here in the real world
by bolomkxxviii on Wed 9th Aug 2006 10:51 UTC
bolomkxxviii
Member since:
2006-05-19

I have tried ViaVoice and Dragon Naturally Speaking. Went back to typing. It is faster and more accurate.

Reply Score: 1

RE: And here in the real world
by netpython on Wed 9th Aug 2006 11:00 UTC in reply to "And here in the real world"
netpython Member since:
2005-07-06

Went back to typing. It is faster and more accurate.

Exactly!

And i think a real graphics artist can't get his/her "fingerspitzengefuehl" substituted by voice recognition that easily anyway.

Reply Score: 2

My experience
by Isolationist on Wed 9th Aug 2006 11:10 UTC
Isolationist
Member since:
2006-05-28

I am Vista's delete using leech recognition as oui peak help

Reply Score: 1

Since 1999...
by doubleUb on Wed 9th Aug 2006 11:32 UTC
doubleUb
Member since:
2005-12-08

Microsoft speech recognition is available for free download, with SDK.

I was programming in Delphi under Windows Me at the time, and it worked well for speech commands. I remember having a small Merlin at the top left of my screen, waiting for commands like "switch delphi" to bring up Delphi when it was not on top.

The only issue for me was that it was English only (even if I could have free French speech synthesis).

Since then... Nothing happened. I am pretty sure the speech recognition engine is still the same. I have been waiting for that for years.

Now, I have a graphic tablet with shortcuts. Definitely the best HID device I ever had.

Reply Score: 3

really impressive
by Yagami on Wed 9th Aug 2006 11:39 UTC
Yagami
Member since:
2006-07-15

The video really impressed me!!! I must say, the computer understood what he was saying much better than I did! (I always thought he was saying closeup)

Reply Score: 1

ABM'ers
by makfu on Wed 9th Aug 2006 12:31 UTC
makfu
Member since:
2005-12-18

This is an impressive demo. Alas, no matter how impressive Vista might be, the ABM'ers will say anything to justify their irrational vitriol of Microsoft.

Reply Score: 2

RE: ABM'ers
by Lambda on Wed 9th Aug 2006 16:15 UTC in reply to "ABM'ers"
Lambda Member since:
2006-07-28

This is an impressive demo. Alas, no matter how impressive Vista might be, the ABM'ers will say anything to justify their irrational vitriol of Microsoft.

Exactly. I was actually surprised at how well it worked and how integrated it was into the environment. If this worked as well in a Gnome or KDE desktop it would be the greatest thing since sliced bread.

These people are so transparent. They're bitter it did work so well.

Reply Score: 2

Voice Recognition
by brother bloat on Wed 9th Aug 2006 12:36 UTC
brother bloat
Member since:
2005-07-06

The problem with voice recognition, in my opinion/experience, is that, while it may be reliable at times, it's almost never reliable 100% of the time. To me, as a user, I'm used to communicating with other human beings, who can understand what I say even when I mumble, or when I'm standing in a crowded room with the music blaring.

When the computer fails to reach the level of reliability I'm used to with other humans, I find myself unable to rely on the computer for voice recognition at all as a tool. Instead, voice recognition becomes slower than simply clicking my mouse and typing on my keyboard -- i.e. what I've always done (and gotten used to) in order to interact with the computer.

Speaking to the computer is like wading through thick oatmeal; while it should be easy and effortless, I find myself becoming annoyed when simple tasks take just a little bit longer.

Until voice recognition reaches (or comes very close) to the level of human voice interactions (perhaps through image recognition via the more and more popular webcam for lip reading??), I'm staying away. To me, until then, voice recognition will be a neat gimmick or a toy, but it's not going to cut it for "real" work.

Reply Score: 2

That's nice.
by Sphinx on Wed 9th Aug 2006 13:36 UTC
Sphinx
Member since:
2005-07-09

I had speech recognition working on my 8086/CPM powered apricot portable in the 80's, blew a few minds but that's about all, not a great time saver, just a neat toy.

Reply Score: 2

speech recognition is a disaster
by ozonehole on Wed 9th Aug 2006 14:03 UTC
ozonehole
Member since:
2006-01-07

I realize that it could be of use to the physically impaired, but for the vast majority speech recognition as it's currently used is a disaster. I absolutely dread calling Qwest (a US phone company) or my credit card company, or numerous airline companies, because I've got to wade through menus where a computer asks me questions that I'm supposed to answer. Only it misinterprets the answers half the time and directs my call to the wrong place. I irrationally find myself yelling at the computer.

I would say that, at least in America, the quality of life has deteriorated somewhat thanks to speech recognition, and it continually gets worse as more companies and government offices adopt this "labor saving technology". I wonder when the Diebold election rigging machines are going to start using speech recognition.

Remember the old days when actual humans answered the phones? I'm almost grateful for those call centers in India - just about the only chance you've got to reach a real person (though usually only after you've spent 10 minutes yelling at a computer).

Edited 2006-08-09 14:07

Reply Score: 3

Nice app
by michael135 on Wed 9th Aug 2006 14:21 UTC
michael135
Member since:
2006-06-21

Speech recognition works fine when the application compares the input against a limited number of entries, as in this video. It becomes much harder when the user is allowed to speak freely, as when dictating an email.
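
To illustrate the limited-entries case, here is a minimal sketch in Python. It assumes the recognizer has already produced a rough text guess, and it simply snaps that guess to the nearest entry in a small command list. The command list and the `match_command` helper are hypothetical names made up for illustration; this is not how Vista's engine works internally.

```python
import difflib

# Hypothetical command list, loosely based on commands mentioned in this thread.
COMMANDS = ["minimize", "maximize", "start search", "close that", "switch application"]

def match_command(heard, commands=COMMANDS, cutoff=0.6):
    """Return the known command closest to the heard text, or None.

    'heard' is an assumed rough transcription from some recognizer.
    Anything scoring below 'cutoff' similarity is rejected instead of guessed.
    """
    matches = difflib.get_close_matches(heard.lower(), commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None

if __name__ == "__main__":
    for phrase in ["minimise", "Maximize", "xyzzy qqq"]:
        print(repr(phrase), "->", match_command(phrase))
```

With a closed list, a slightly wrong guess such as "minimise" can still land on "minimize", and unrelated input can be rejected outright. Free dictation has no such list to fall back on, which is why it is the harder problem.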

Reply Score: 1

RE: Nice app
by MollyC on Wed 9th Aug 2006 15:54 UTC in reply to "Nice app"
MollyC Member since:
2006-07-04

He does dictation at the end of the video, and it worked fine.

Reply Score: 1

Evil
by agentj on Wed 9th Aug 2006 14:33 UTC
agentj
Member since:
2005-08-19

Shout "open cmd.exe - enter - format /autotest /u c:" ]:-D

Reply Score: 1

RE: Evil
by IMesh on Wed 9th Aug 2006 15:22 UTC in reply to "Evil"
IMesh Member since:
2006-06-08

Oh I get it, like you're going to format the Windows partition! Aren't you clever. That's almost as cool and clever as typing MS.

Reply Score: 0

Exchange 2007
by REM2000 on Wed 9th Aug 2006 15:42 UTC
REM2000
Member since:
2006-07-25

Another side point to make is that Exchange 2007 also has speech recognition. Microsoft demoed this a few months ago at a TechNet UK event.

It was one of the most impressive things I've ever seen (or should that be heard ;) ). The Microsoft employee was able to call up the Exchange server via a mobile phone and tell it to open her inbox, calendar etc., change appointments, send voice mails etc., all via her voice without a single key-press on the mobile.

It's nice to see this technology finally start lifting off the ground.

Reply Score: 2

Impressive
by ThawkTH on Wed 9th Aug 2006 16:42 UTC
ThawkTH
Member since:
2005-07-06

I'm usually quite anti-Microsoft. I've, on more than one occasion, accused Vista of being a prettified XP. Nothing more.

I've begun to see the error in my thinking. Do I think Vista's as different and groundbreaking as it should be after 5 years (+)?

Not at all.

This I do find impressive. Even if most people don't use it. Even if it is a toy once in a while. It's different and appears to work extremely well.


Now someone needs to come up with something total nerds would totally love - a tablet PC with a decently designed and integrated LCARS interface running Linux with awesome speech recognition and voice responses.
"Computer, status."

"Fetching updates. 1 minute remaining."

Hey, I've been dreaming of it since I was 10!

Haha, seriously though, I think it would be cool to have some verbal responses - I'd just choose the trek lady voice I learned to love watching ST:Voyager. It would go a long way to integrating the computer into the home rather than having it be "just another appliance" that someone opens for e-mail once in a while.

/end ramble.

Reply Score: 1

RE: Impressive
by 47ronin on Wed 9th Aug 2006 16:52 UTC in reply to "Impressive"
47ronin Member since:
2006-04-03

Already done.

http://en.wikipedia.org/wiki/Speakable_items

Mac OS X has Speakable Items, which has some preset commands AND opens up a world of possibilities because the speech recognition engine can tie into the Terminal. For example, if you wanted the uptime of your system, you could speak a command that launches an AppleScript which parses the value of the "uptime" command and makes it phonetically speakable by the Mac.

http://www.xvsxp.com/misc/speech.php

Reply Score: 1

Warp
by Kancept on Wed 9th Aug 2006 21:15 UTC
Kancept
Member since:
2006-01-09

I've been using the voice recognition in OS/2 for years. My voice profile is quite large, and I get a VERY good recognition rate (above 90%, typically higher). I admit, I do have years invested in my speech profile, but I have carried it over each time so I do not need to start fresh. The profile, I'd say, has about 10 years of patterns to go on. I also have my custom assignments in it, like "send" and things that were mentioned above.

I would like to see a more fluid system instead of the boxiness when dictating, but I have grown accustomed to it after all these years. While I don't find it a necessity, I do like the option of input methods. I don't see voice input as a waste at all.

Reply Score: 1

Wow... Amazing!
by proforma on Wed 9th Aug 2006 22:36 UTC
proforma
Member since:
2005-08-27

Oh my gosh. Did we get through this thread without a "Microsoft and Vista blow" or "Linux rul3z for life!" or "Linux already has this feature and Microsoft did it horribly!"?

I can't believe it!

Praise God!

Maybe the community here is actually buying a clue and moving on with their lives.

Imagine that!

Reply Score: 1