Hacker News

an hour ago by imedadel

Facebook's focus on unsupervised machine learning is a huge plus for under-resourced languages. They had a similar article for unsupervised machine translation[1] before, and I can see how it'll open doors for many African languages.

[1] https://engineering.fb.com/2018/08/31/ai-research/unsupervis...

an hour ago by savant_penguin

Not only that, but it's also great for English models. Many of them are trained on the LibriSpeech dataset, which has 960 hours of labeled audio.

What if you trained on 960,000 hours instead?

2 hours ago by axiosgunnar

Hilarious how the Arabic „hello“ in that picture is completely mangled because they rendered it left to right instead of right to left

(basically instead of hello it says „o l l e h“)

2 hours ago by sirfz

In the title image, the Arabic word "ahlan" (welcome) is spelled correctly but left to right instead of right to left (should look like this: أهلا)

32 minutes ago by callamdelaney

Hopefully they license it to Tesla so that my car stops misinterpreting 'call Jodie' as 'cool jodi'.

5 hours ago by yboris

Quick summary via a Tweet: https://twitter.com/nasim_rahaman/status/1395812572037795847

2 hours ago by Causality1

Will Facebook license this tech out? I can't be the only one who's noticed that Google's speech recognition has gotten significantly less accurate over the last couple of years, probably as the result of some cost-cutting strategy.

an hour ago by simonw

They have an open source repo linked from the article here: https://github.com/pytorch/fairseq/tree/master/examples/wav2...

an hour ago by m463

I wonder if it's less accurate for specific people but more accurate in general. In other words, speech recognition was more accurate in the beginning for English-speaking men, and maybe now it's better for women, kids, people with accents, other languages, etc.

They train this on examples of speech, and maybe the training data is broader now.

High-performance speech recognition with no supervision at all