
Read My Lips! Don’t Talk About Terror

Steve Watson posted on Infowars about his concern over lip-reading surveillance cameras. The story was also mentioned on Slashdot. Steve pointed to several Orwellian technology deployments in London, including talking cameras, face-scanning cameras, and eavesdropping cameras. Evidently the British Home Office has some interest in investigating technology that would watch for terror-related keywords on the lips of passersby.

I love Steve’s commentary. But what would it take to make that capability a reality? Let’s see, we’d need a software platform efficient enough to run a complex algorithm very fast.
Then we’d need the algorithm itself, capable of interpreting very subtle visual images and rendering them as sounds.
Then there would have to be separate sound analyzer software, unless the Home Office wanted to hire hundreds of people with a variety of language skills to monitor the output constantly.

It’s an interesting technical challenge. Of course, capturing video with enough resolution for the analytics to work properly would require upgrading to high-megapixel cameras. Megapixel images, in turn, would have to be sent to the processing engine over a very fat network connection. Even if one could produce the software and algorithm, the processing power required would be intensive, to say the least, so pushing the processing out to the "edge" near the cameras would require even more science fiction and extraterrestrial financing.
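To put a rough number on the "very fat network connection," here is a back-of-the-envelope sketch. Every figure in it is an assumption chosen for illustration (5-megapixel sensor, 24 bits per pixel, 15 frames per second, a 50:1 compression ratio, 100 cameras); none of them comes from the Home Office or any real deployment.

```python
def raw_bitrate_mbps(megapixels: float, bits_per_pixel: int, fps: float) -> float:
    """Uncompressed video bitrate in megabits per second."""
    bits_per_second = megapixels * 1_000_000 * bits_per_pixel * fps
    return bits_per_second / 1_000_000


def compressed_bitrate_mbps(raw_mbps: float, compression_ratio: float) -> float:
    """Bitrate after compression, given an assumed compression ratio."""
    return raw_mbps / compression_ratio


# Hypothetical camera: 5 MP, 24-bit color, 15 frames per second.
raw = raw_bitrate_mbps(megapixels=5, bits_per_pixel=24, fps=15)

# Assumed 50:1 compression; heavier compression may blur the lip detail
# the analytics would depend on.
per_camera = compressed_bitrate_mbps(raw, compression_ratio=50)

# Assumed fleet of 100 cameras feeding one central processing engine.
fleet = per_camera * 100

print(f"raw per camera:        {raw:.0f} Mbps")
print(f"compressed per camera: {per_camera:.0f} Mbps")
print(f"100 cameras:           {fleet:.0f} Mbps")
```

Under these assumptions a single camera needs tens of megabits per second even after compression, and a modest fleet needs several gigabits per second at the central processor. Change any of the inputs and the totals move accordingly, but the order of magnitude is the point.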

No… lip-reading cameras are nothing I’ll be worrying about for a long, long time.

  1. May 3, 2007 at 1:05 pm

    Hi again Steve,
    Per usual, you are right on here. As you know, in my previous life I was the CEO of the US subsidiary of the company that implemented the automated license plate reading scheme deployed around the UK. As a machine vision company, we developed automated identification and analysis systems for everything from currency inspection to measuring the length of french fries. We also put a facial recognition system into Heathrow Airport in the mid-1980s.
    Any time you are trying to solve an “organic” (i.e., natural) image recognition problem, you have moved to the edge of the state of the art, if not beyond it. Surely the application described here is not one that would be taken on as the basis of a business. Sure, if someone gave me a bucket of money I would mock up something that could do some of this. But the one skill I did learn in the machine vision business was when to say no. If someone came to me and asked whether this is something we could do with the best imaging sensors, hardware, algorithms, lighting schemes, etc., the answer would be a very easy no.
    To do this in a camera would be a non-starter since the cost would go through the roof, so again this is not likely to be addressable with off-the-shelf cameras.
    Yeah, the Brits are wacky when it comes to the number of video cameras they have in place, but that doesn’t mean that suddenly there is an ability to learn any more from them than what you get from looking at a video monitor.
    Most of their uses of surveillance are forensic, that is, looking for stuff after the fact. Some use cases exist if they get a tip to monitor a very specific area. Automating visual speech recognition, read my lips… NO.

  2. May 14, 2007 at 5:42 am

    I’m not a security expert and really only ended up on your blog because a friend referred me to your review of the book on identity theft, but I continued reading other entries and saw this one.
    You might want to google “automated lip reading” and then reconsider whether or not there should be concern about this. At some point in the past week I watched a TV show that talked about how this technology was used to determine what Hitler was saying in home movies. Very interesting, and it appears that the technology you’re not worried about is well on its way.

  3. Frank Yeh
    May 24, 2007 at 12:45 pm

    Normally I agree with your positions but in this case there are two considerations that seem to make your argument less compelling.
    First, we do not need software platforms to do this in real time. Anything that can be done in software can eventually make its way into hardware. Video analytic algorithms embedded in hardware are already on the market.
    Second, 100% accuracy in real time is probably not doable and definitely not affordable with current technology, but 50% accuracy in a forensic fashion might be enough to justify the system.
