Saturday, June 12, 2010

Digital Fingerprint

Because of lawsuits from copyright holders, YouTube (Google) checks for copyrighted audio/video each time you upload something. I think the process (Audio ID and Video ID) uses a digital fingerprint, which is a highly compressed/lossy version of the copyrighted data, and each upload is compared against a database of these fingerprints. The technology is pretty cool and fairly robust, as I found out.
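To make the "highly compressed/lossy version" idea concrete, here is a toy sketch in Python. It is not YouTube's algorithm (which isn't public as far as I know); it is a simple "average hash" that I'm using only as an illustration. The function names, the 8x8 grid, and the synthetic test frames are all my own assumptions.

```python
# Toy illustration only: a simple "average hash", NOT the actual Audio ID/Video ID method.

def average_hash(frame, grid=8):
    """Reduce a grayscale frame (list of rows of 0-255 ints) to grid*grid bits.

    The frame must be at least `grid` pixels wide and tall.
    """
    height, width = len(frame), len(frame[0])
    cells = []
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * height // grid, (gy + 1) * height // grid
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            block = [frame[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))  # average brightness of this cell
    mean = sum(cells) / len(cells)
    # One bit per cell: is the cell brighter than the frame's overall average?
    return [1 if c > mean else 0 for c in cells]


def hamming_distance(a, b):
    """Count the bit positions where two fingerprints differ."""
    return sum(x != y for x, y in zip(a, b))


# Synthetic 64x64 test frames (hypothetical data, just a brightness gradient).
original = [[x * 3 + y for x in range(64)] for y in range(64)]
brighter = [[p + 3 for p in row] for row in original]   # uniform brightness change
mirrored = [list(reversed(row)) for row in original]    # horizontally flipped

fp = average_hash(original)
print(hamming_distance(fp, average_hash(brighter)))  # small distance = likely match
print(hamming_distance(fp, average_hash(mirrored)))  # large distance = no match
```

A matcher built on something like this would flag an upload when the distance to a stored fingerprint falls below some threshold. A fingerprint this coarse throws away fine detail and overall brightness, which would explain why mild edits to the video don't change the result.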

I uploaded an episode of Let's Go Dream Team (season 2 episode 29) from KBS World. The entire show was a little over an hour, so I split it into 6 parts for YouTube upload. The original file was ~728MB (640x480, 1500kbps MPEG4 video, 128kbps AAC audio), so each segment was ~120MB. All of them uploaded okay except part 2, which triggered a copyright content match and was blocked by YouTube.

The first thing I tried was splitting the 10-minute file into 2-minute segments to see which part was matching the digital fingerprint... however, all 5 uploads triggered the content block. Next I tried the following on the first 2-minute segment, all to no avail: Gaussian blur, removing chroma (black and white), logo removal (major blurring) on different parts of the frame, removing the audio, various encoding codecs, and resizing/adding borders. I finally chose a 2x2 mosaic filter, which probably altered the file so much that it fooled the fingerprint process.


Part 1


Part 2

The mosaic filter basically shrinks the video 2x to 320x240 and tiles it into a 2x2 grid. The 4 tiles show 4 consecutive video frames (i.e., the 4 pictures are not exactly the same), which is probably what throws off the video fingerprint.
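Here is a rough Python sketch of what a filter like this does to the frames. This is my own approximation, not the code of the filter I actually used: the function names are made up, the tile order (top-left, top-right, bottom-left, bottom-right) is an assumption, and it works on grayscale frames stored as plain lists.

```python
# Rough sketch of a 2x2 mosaic; tile order and averaging method are assumptions.

def shrink_half(frame):
    """Halve width and height by averaging each 2x2 block of pixels."""
    h, w = len(frame) // 2, len(frame[0]) // 2
    return [
        [
            (frame[2 * y][2 * x] + frame[2 * y][2 * x + 1]
             + frame[2 * y + 1][2 * x] + frame[2 * y + 1][2 * x + 1]) // 4
            for x in range(w)
        ]
        for y in range(h)
    ]


def mosaic_2x2(frames):
    """Tile four consecutive frames into one frame of the original size."""
    if len(frames) != 4:
        raise ValueError("expected exactly four frames")
    top_left, top_right, bottom_left, bottom_right = (shrink_half(f) for f in frames)
    top = [left + right for left, right in zip(top_left, top_right)]
    bottom = [left + right for left, right in zip(bottom_left, bottom_right)]
    return top + bottom


# Four tiny 4x4 "frames", each a flat shade, standing in for consecutive video frames.
frames = [[[shade] * 4 for _ in range(4)] for shade in (10, 20, 30, 40)]
for row in mosaic_2x2(frames):
    print(row)
```

With 640x480 input, each tile comes out at 320x240 and the output frame is still 640x480, but every quadrant now holds a different moment in time, so the layout of the picture no longer looks like the original frame.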

This episode is a competition between a female Dream Team (mostly idol singers) versus female KBS announcers. Out of the 12, I think Oh Jeongyeon (announcer) is the prettiest.



She was also an MC on Star Golden Bell for a while until she got married to a Korean basketball player.
