Bart Thomee of Yahoo Labs discusses "YFCC100M: The New Data in Multimedia Research" (cacm.acm.org/magazines/2016/2/197425), a Contributed Article in the February 2016 issue of Communications of the ACM (CACM).
---
TRANSCRIPT
00:00 The more you see, the more you know. You learn that this is a bird... and so is this... until by example you simply come to understand.
00:15 Recent advances in artificial intelligence let computers learn this way, too. The minds of today's AI also hunger for stimulus. And now their appetites are growing beyond what yesterday's datasets could provide.
00:30 Join us as Bart Thomee introduces the biggest dataset of photos and videos yet, in YFCC100M: The New Data in Multimedia Research.
00:43 [Intro graphics/music]
00:53 Computer vision researchers in the 1990s realized that their algorithms would need datasets of images to practice on. But times have changed, and today's researchers need even more.
01:07 DR. THOMEE: Problem with most datasets, unfortunately, is either they're images or they're video, but not multimodal, so to say, where you have all these different types of media. As well as, well, copyright and legal issues.
01:22 So Dr. Thomee and his colleagues collected one hundred million items -- approximately 99 million photos and 1 million videos -- from the library of the photo-sharing service Flickr, a Yahoo company. This is the Yahoo Flickr Creative Commons 100 Million, or YFCC100M for short. The collection is both broader and deeper than any that's come before it. Just as importantly, it comes with clear licensing guidelines.
01:51 DR. THOMEE: The YFCC100M is one of the exceptions where, since it's all Creative Commons, which some can actually be commercially used, it's open to anyone in the world to use it.
02:01 The team has made YFCC100M available for free in several ways. The basic metadata alone comprise twelve and a half gigabytes, with more metadata available through Flickr's API.
02:14 DR. THOMEE: So we've made the effort of giving people the static information -- the title, description, tags. If people want dynamic information that can change over time like: Which album was this photo? ... the API is perfect.
02:31 The 100 million medium-size items themselves come to a whopping 17 terabytes, with copies hosted at major cloud providers.
02:40 DR. THOMEE: They say, "I want all photos of sunsets taken in this bounding box". They could write a little script that does that and Amazon would give them back the results, just like our grid would return back the same results.
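The "little script" Dr. Thomee mentions could look something like the sketch below. It is only an illustration: it assumes a simplified, hypothetical tab-separated layout (item id, comma-separated tags, longitude, latitude), which is not the actual YFCC100M file format, and the names `BoundingBox` and `find_tagged_in_box` are invented for this example.

```python
from typing import Iterable, Iterator, NamedTuple


class BoundingBox(NamedTuple):
    """A geographic rectangle in degrees (hypothetical helper type)."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lon: float, lat: float) -> bool:
        return (self.min_lon <= lon <= self.max_lon
                and self.min_lat <= lat <= self.max_lat)


def find_tagged_in_box(lines: Iterable[str], tag: str,
                       box: BoundingBox) -> Iterator[str]:
    """Yield ids of items that carry `tag` and were taken inside `box`.

    Assumed (hypothetical) columns per line, separated by tabs:
    id, comma-separated tags, longitude, latitude.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 4:
            continue  # skip malformed lines
        item_id, tags, lon_text, lat_text = fields[:4]
        try:
            lon, lat = float(lon_text), float(lat_text)
        except ValueError:
            continue  # item has no usable coordinates
        if tag in tags.split(",") and box.contains(lon, lat):
            yield item_id


# Made-up sample records, for illustration only.
SAMPLE = [
    "1\tsunset,beach\t-122.5\t37.7",
    "2\tsunset\t2.35\t48.85",
    "3\tbird\t-122.4\t37.8",
    "4\tsunset\t\t",
]

if __name__ == "__main__":
    box = BoundingBox(min_lon=-123.0, min_lat=37.0, max_lon=-122.0, max_lat=38.0)
    print(list(find_tagged_in_box(SAMPLE, "sunset", box)))
```

Because the function streams over lines and yields matches one at a time, the same approach would work on a file too large to hold in memory, such as a multi-gigabyte metadata dump.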
02:53 The researchers continue to enhance YFCC100M with expansion packs, such as one with automatic recognition of visual concepts. The research community at large has improved access, demonstrated applications, and added further features. These advances motivate Dr. Thomee, just as YFCC100M motivates others.
03:17 DR. THOMEE: My drive in this project was to have an equal footing for everyone. And at the same time of course I hope that, if I see other people do interesting things with it, it inspires me to use that to go further in another direction that I hadn't foreseen myself.
03:39 Find out more in the contributed article, "YFCC100M: The New Data in Multimedia Research", in the February 2016 issue of Communications of the ACM.
03:52 [Outro and credits]