Linked by Thom Holwerda on Fri 7th Oct 2011 20:48 UTC
PDAs, Cellphones, Wireless I don't think I've ever seen this before, but please correct me if I'm wrong. Samsung and Google were supposed to unveil the Samsung Nexus Prime with Android Ice Cream Sandwich next week, but in a surprise announcement, the companies said that the press event is cancelled - out of respect for Steve Jobs. In the meantime, leaked specifications reveal that the Nexus Prime could be a real doozy.
Tony Swash Member since:
2009-08-22



Raise your hand if you have extensive experience dealing with Apple's PR dept.

*raises hand*

Can't say more.


I guess that means you don't have any evidence.

It's painful to watch people who claim to be interested in technology not get Siri. To use the term 'voice recognition system' in relation to Siri just underlines how deeply you have missed the point. Voice recognition is not trivial but it's not new. Siri is not a voice recognition system it's an AI system.

Eventually you will catch on, and then when it changes everything you can start writing articles about how Apple didn't invent it ;)

Reply Parent Score: 1

Thom_Holwerda Member since:
2005-06-29

I'm sorry, more than my word I cannot offer. It's pretty much a given though - you can't criticize Apple and still get early access to their stuff for reviews, or press invites. That's how Apple keeps a tight grip on the press, and ensures all early reviews are positive. I have dealt with Apple about this a lot, but of course, it's all confidential. I can understand you won't believe me, that's fine.

As for Siri - you just proved my point. You haven't used it, have no idea how it works, yet you automatically believe it's perfect and will change the world. You're a believer in the church of Apple.

I'm not. I'm a sceptic, with everything (except Fiona Apple). I want to actually use it first in a real-world environment. Then I'll judge.

And you're right, Apple did not invent Siri. They bought it.

Reply Parent Score: 3

Tony Swash Member since:
2009-08-22

I'm sorry, more than my word I cannot offer. It's pretty much a given though - you can't criticize Apple and still get early access to their stuff for reviews, or press invites. That's how Apple keeps a tight grip on the press, and ensures all early reviews are positive. I have dealt with Apple about this a lot, but of course, it's all confidential. I can understand you won't believe me, that's fine.

As for Siri - you just proved my point. You haven't used it, have no idea how it works, yet you automatically believe it's perfect and will change the world. You're a believer in the church of Apple.

I'm not. I'm a sceptic, with everything (except Fiona Apple). I want to actually use it first in a real-world environment. Then I'll judge.

And you're right, Apple did not invent Siri. They bought it.


So you are backtracking. You said people were threatened with being cut off; now you say Apple gives product previews to preferred reviewers - duh, who doesn't?


If you are confused by Siri you might want to have a look at this - the pedigree is extremely impressive.

http://9to5mac.com/2011/10/03/co-founder-of-siri-assistant-is-a-wor...

Forget your prejudices and embrace the new wherever it comes from. Phobias are so limiting.

Reply Parent Score: 2

Neolander Member since:
2010-03-08

"Siri is not a voice recognition system it's an AI system"

(Disclaimer : Although I believe I have the required knowledge of physics, signal theory, and programming, I have never worked directly on a voice recognition system. So anyone who has, please correct me if you detect some bullshit in the upcoming post)

So you believe that it is possible to make a decent voice recognition system without AI ? I don't think so, and am going to explain why.

What is voice recognition ? Basically speech to text translation. Basic theory is that you take an audio file or stream of someone saying something, you isolate words and detect punctuation based on the pauses and intonations of the talk, then you take each word separately and try to slice it into phonemes, which are pretty close to syllables but not quite the same thing. From phonemes, you can get the textual word. (to be continued, stupid 1000 char phone browser limit)
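To make the "isolate words based on the pauses" step a bit more concrete, here is a toy sketch. It assumes the audio has already been reduced to one loudness value per short frame, and the function name, threshold, and pause length are all made up for illustration. Real systems are far more involved than this.

```python
def split_on_pauses(energies, threshold=0.1, min_pause=2):
    """Return (start, end) frame ranges of speech, end exclusive.

    A segment ends once at least `min_pause` consecutive frames fall
    below `threshold`. All parameters are illustrative assumptions.
    """
    segments = []
    start = None   # index where the current segment began
    quiet = 0      # consecutive quiet frames seen inside a segment
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if start is None:
                start = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_pause:
                # Close the segment at the first quiet frame of the pause.
                segments.append((start, i - quiet + 1))
                start = None
                quiet = 0
    if start is not None:
        # Audio ended mid-segment; drop any trailing quiet frames.
        segments.append((start, len(energies) - quiet))
    return segments


frames = [0.0, 0.5, 0.6, 0.0, 0.0, 0.7, 0.8, 0.05, 0.9, 0.0]
words = split_on_pauses(frames)
```

Note that the single quiet frame at index 7 is too short to count as a pause, so it stays inside the second segment.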

Edited 2011-10-10 16:47 UTC

Reply Parent Score: 1

Tony Swash Member since:
2009-08-22

"Siri is not a voice recognition system it's an AI system"

(Disclaimer : Although I believe I have the required knowledge of physics, signal theory, and programming, I have never worked directly on a voice recognition system. So anyone who has, please correct me if you detect some bullshit in the upcoming post)

So you believe that it is possible to make a decent voice recognition system without AI ? I don't think so.

What is voice recognition ? Basically speech to text translation. Basic theory is that you take an audio file or stream of someone saying something, you isolate words and detect punctuation based on the pauses and intonations of the talk, then you take each word separately and try to slice it into phonemes, which are pretty close to syllables but not quite the same thing. From phonemes, you can get text. (to be continued, stupid 1000 char phone browser limit)


Honestly - the lengths some people will go to, people who claim to be genuinely interested in technology, to argue absurdities just so they can belittle something Apple is doing. Do you really believe any of that tosh you just wrote?

Clearly speech recognition software recognises words. It may have attached to it a programme that can recognise set phrases and connect those set phrases to an object. That is impressive but you know as well as I that such software is very limited and that it is a very stupid system.

What Siri does is listen to what you are saying and then infer from the context of the conversation what phrases might mean. It seems to do this an order of magnitude better than anything else out there, let alone anything on a phone. So if you are having a conversation with Siri about two appointments clashing you seem to be able to say something like 'move it to the next day' and (like a human could) Siri will know what 'it' is and what the next day is and what moving 'it' means, all from the context of the conversation you are having with it. If it works as claimed, and those commentators with hands-on experience say it does indeed seem to work as claimed, then Siri is very, very impressive and might well represent a true step forward in the way humans interact with technology.
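As a rough illustration of what resolving 'it' from context could look like, here is a toy sketch. This is not how Siri works internally (that is not public in this thread); the Conversation class, its method names, and the appointment data are all invented for the example. It only shows the idea that 'it' maps to the most recently mentioned item.

```python
from datetime import date, timedelta


class Conversation:
    """Keeps track of the last thing mentioned so 'it' can be resolved."""

    def __init__(self):
        self.focus = None  # most recently mentioned appointment, if any

    def mention(self, appointment):
        self.focus = appointment

    def move_it_to_next_day(self):
        # 'it' is whatever was mentioned last; fail loudly if nothing was.
        if self.focus is None:
            raise ValueError("nothing in context for 'it' to refer to")
        self.focus["day"] += timedelta(days=1)
        return self.focus


chat = Conversation()
dentist = {"title": "Dentist", "day": date(2011, 10, 10)}
chat.mention(dentist)
moved = chat.move_it_to_next_day()
```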

So as I said if people who claim to be interested in technology want to argue that it is trivial just because it is attached to Apple well more fool them. The only way to lose a limiting phobia is to stop being afraid of the phobic object.

Reply Parent Score: 2

Neolander Member since:
2010-03-08

Now, what are the problems which explain why computers took so much time to get this speech to text translation relatively right ?

First, there is the [word sliced in phonemes]->[written word] translation. It is not as simple as it looks, because many European languages have this "feature" that there are several ways to write a given phoneme. If you go to Asia, things are even worse : words are not commonly spelled using syllables, but using more complex characters which are often also words in their own right.

For all these reasons, voice recognition systems need an internal dictionary to associate a bunch of phonemes with a written word.

As a starting point, someone who wants to create such a dictionary can use a regular dictionary, take the phonetic expression of each written word, and create a phonetic-to-written dictionary from that. But if you stop at this stage, you'll miss all the everyday familiar vocabulary that is not officially recognized by national dictionaries, such as weasel words. These words, along with other things which are not found in dictionaries (such as the names of numbers, letters, and mathematical symbols) must be added manually.

Manually adding words that are not in the dictionary takes a lot of time and effort, and developers cannot think of everything, so some words will always end up missing. Especially taking into account that our vocabulary is in constant evolution. For this reason, good voice recognition systems must be able to learn new words. Which is a first form of AI.
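A minimal sketch of such a phonetic-to-written dictionary, including the "learn new words" part, could look like the following. The class, its method names, and the phoneme symbols are invented for illustration (they are only loosely inspired by common phonetic notations), and a real system would store much more than a plain lookup table.

```python
from collections import defaultdict


class PhoneticDictionary:
    """Maps a sequence of phonemes to the written words it may stand for."""

    def __init__(self):
        self._entries = defaultdict(set)

    def add(self, phonemes, word):
        self._entries[tuple(phonemes)].add(word)

    def lookup(self, phonemes):
        # Several spellings can share one pronunciation, so return a list.
        return sorted(self._entries.get(tuple(phonemes), ()))

    def learn(self, phonemes, corrected_word):
        """Store a word the user typed after recognition failed."""
        self.add(phonemes, corrected_word)


lexicon = PhoneticDictionary()
lexicon.add(["F", "OW", "N"], "phone")      # seeded from a regular dictionary

unknown = ["S", "IH", "R", "IY"]            # a word no dictionary contains yet
before = lexicon.lookup(unknown)            # nothing found
lexicon.learn(unknown, "Siri")              # the user supplies the spelling
after = lexicon.lookup(unknown)
```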

Edited 2011-10-10 17:05 UTC

Reply Parent Score: 1

Neolander Member since:
2010-03-08

Second, there are homophones. Two different words which are pronounced in the same way. These are very frequent in the basic French vocabulary, I don't know what the situation is in English.

How does a voice recognition system discriminate between both ? It can use two tools : the frequency at which a word is used (when in doubt, the most frequently used word is the safest bet), and structural analysis of the sentence to check which word it is most likely to be.

As an example of the second form of discrimination, in French we have "a", which is the third-person singular present form of the verb "avoir" (to have), and "à", a preposition which is used to introduce location complements in a sentence. Both are extremely common. To discriminate between both, the voice recognition system could check the sentence for the presence of another verb. If there is none, then we are most likely talking about "a".

I hope that it is obvious that both word frequency analysis and sentence analysis are operations that are best adapted to each individual user, who has a different way to speak. So we need learning, so we need AI here too.
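Here is a toy sketch of both tools. The word counts, the class and function names, and the part-of-speech tags are made up for the example, and the "a"/"à" rule is just the crude heuristic described above, not a real grammar check.

```python
from collections import Counter


class HomophoneChooser:
    """Picks among homophones by frequency and adapts to one user."""

    def __init__(self, base_counts):
        # Start from general usage counts (made-up numbers here).
        self.counts = Counter(base_counts)

    def choose(self, candidates):
        # When in doubt, the most frequently used word is the safest bet.
        return max(candidates, key=lambda word: self.counts[word])

    def confirm(self, word):
        # Called when the user accepts or types this word: learn from it.
        self.counts[word] += 1


def pick_a_or_a_grave(other_tags):
    """Crude rule: with no other verb in the sentence, 'a' must be the verb."""
    return "a" if "VERB" not in other_tags else "à"


chooser = HomophoneChooser({"there": 5, "their": 3})
first = chooser.choose(["their", "there"])
for _ in range(3):
    chooser.confirm("their")                 # this user writes "their" a lot
second = chooser.choose(["their", "there"])

verb_case = pick_a_or_a_grave(["PRON", "DET", "NOUN"])    # "il a un chat"
prep_case = pick_a_or_a_grave(["PRON", "VERB", "PROPN"])  # "il va à Paris"
```

The point of the `confirm` step is that the same ambiguous sound ends up resolved differently for different users once the system has seen enough of their corrections.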

Reply Parent Score: 1

Neolander Member since:
2010-03-08

So far, I have assumed that there is a unique way to pronounce a given word across all countries which speak a given language. This is, of course, perfectly false. Regional differences are strong, to the point where even humans sometimes have a hard time understanding each other.

As an example, having mostly learned British and American English, I have a hard time communicating with Indian people. I know the words, just not the pronunciation. In French, some people pronounce "é"s the way I pronounce "è"s, some people pronounce letters which I don't pronounce in words, and vice-versa. Words have a different meaning and are used in different contexts. In fact, even the way punctuation is introduced in a spoken sentence can subtly vary.

A voice recognition system must adapt itself to this. Since we generally only specify what is the language we're speaking, and not the regional variant, it has to guess which regional variant we are using, and remember that. If it doesn't know about our regional variant, it must also get used to it. Again, our voice recognition system learns, so that's AI.
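One naive way to sketch the "guess the variant and remember it" idea is to score each known variant against what the user has actually said. The variant table, the rough pronunciation strings, and the function name are all invented for the example; they are not real phonetic transcriptions.

```python
# Toy pronunciation tables per regional variant (rough, illustrative only).
VARIANTS = {
    "en-GB": {"tomato": "t-ah-m-AA-t-oh", "schedule": "SH-eh-d-y-ool"},
    "en-US": {"tomato": "t-ah-m-EY-t-oh", "schedule": "SK-eh-j-ool"},
}


def guess_variant(observations, variants=VARIANTS):
    """Return the variant whose table matches the most observations.

    `observations` is a list of (word, heard_pronunciation) pairs.
    """
    def score(name):
        table = variants[name]
        return sum(1 for word, heard in observations if table.get(word) == heard)

    return max(variants, key=score)


heard = [("tomato", "t-ah-m-EY-t-oh"), ("schedule", "SK-eh-j-ool")]
remembered_variant = guess_variant(heard)   # stored for the next utterances
```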

Reply Parent Score: 1

Neolander Member since:
2010-03-08

What about individual differences in pronunciation ? Even in a given region, people talk in a different way, depending on the life they have lived. Some people speak slowly, others go very fast. Some people use a very formal vocabulary, while others are very informal in their everyday speech. The voice recognition system must adapt itself to these different behaviors if it wants to have optimal performance.

Then even for a single individual, pronunciation varies depending on the circumstances. You speak differently when you're tired, when you're running, when you're in a meeting, when you're troubled, when you're in shock... Again, a voice recognition system must adapt itself to that. Thus, more AI.

Reply Parent Score: 1

Neolander Member since:
2010-03-08

I could go on and on about detecting phonemes in noisy environments, people who "eat" phonemes when they speak too quickly, neologisms, context sensitivity and the languages that are heavily based on that such as Japanese, and so on, but I hope that at this stage you see my point.

Many people, among whom I believe you are included, think that voice recognition is simple. This feeling comes from the fact that we do it every day, in a relatively painless fashion, only infrequently asking people to repeat what they just said. The truth is, it is not, and there is a reason why children take so much time to get a rich vocabulary.

Voice recognition is a fantastically complex problem, whose complexity probably borders that of translating one language to another. It is not only a problem of processing power, but also of gathering the required knowledge in a way that is accessible to a computer program. AI gathers knowledge from where it is most useful, the user, and makes use of it to improve the recognition quality, so it is obviously a vital part and has been there for ages. In academia, I am ready to bet that voice recognition is mostly studied in AI labs, in the same kind of team that works on automated translation.

Saying that "Siri is different from voice recognition because it is an AI" is thus deeply, totally wrong. Voice recognition IS AI. Slapping stuff after it which processes the extracted text, like a WolframAlpha backend that can find answers to an oral question, is certainly a nice touch, could qualify as an interesting integration effort, but is by no means the revolution you want to make it out to be.

Edited 2011-10-10 17:42 UTC

Reply Parent Score: 1

Tony Swash Member since:
2009-08-22

What can I say guys - watching otherwise intelligent people twisting themselves in absurd knots just so they can claim that something important that Apple has done is trivial is embarrassing and unnecessary. You just have to let go of a phobia and your freedom of action and thought is immediately increased. Just try it.

As I said before if, as is likely, Siri turns out to have been a major inflection point in tech development, it is sadly predictable that you guys will be the first trumpeting about how Apple didn't invent it. It's all so tiresome.

Reply Parent Score: 1