This is an interview recorded at the O'Reilly Web 2.0 Summit in November of 2006 between Robert Scoble and Gary Flake, founder of Microsoft Live Labs. Since the original site and video at PodTech have fallen into disrepair as of late, I asked Robert if he would grant me permission to reupload his interview here and he gave me the all clear.

Robert's original description:

Gary Flake, distinguished engineer at Microsoft, gives us a tour around a new 3D photo experience that amazed attendees at the O’Reilly Web 2.0 Summit yesterday when it was demonstrated. Several people came up to me afterward and said it was the coolest thing they had seen all week.

Transcript:

Guest: Gary Flake - Microsoft
Host: Robert Scoble - The ScobleShow

Robert Scoble - The ScobleShow
So, who are you?

Gary Flake - Microsoft
My name is Gary Flake. I'm a technical fellow at Microsoft and, uh, founder and head of 'Live Labs'.

Robert Scoble - The ScobleShow
'Windows Live.'

Gary Flake - Microsoft
No, no, no, no, no, no, no. 'Live Labs.'

Robert Scoble - The ScobleShow
'Live Labs'?

Gary Flake - Microsoft
You added the 'Windows'; I didn't. It's 'Live Labs'.

Robert Scoble - The ScobleShow
(laughs) Alright.

Gary Flake - Microsoft
Yeah, it's just Microsoft Live Labs.

Robert Scoble - The ScobleShow
So, you... you had the hit of the O'Reilly Web 2.0 Summit conference this week.

Gary Flake - Microsoft
I... Well, I'm glad to hear you say that!

Robert Scoble - The ScobleShow
Well, I arrived this morning and [something] like ten people came up to me [and said], "Did you see Photosynth?", so...

Gary Flake - Microsoft
That's good; I am glad to hear that!

Robert Scoble - The ScobleShow
... So, that's from Live Labs?

Gary Flake - Microsoft
Yup, from Live Labs.

Robert Scoble - The ScobleShow
Okay... and you're going to show it to me?

Gary Flake - Microsoft
Yeah! Yeah, so, it's right here.

So, let me, let me... before we dive into this, let me give you a little bit of the back story. The project is actually the marriage of two different technologies that, uh, were developed independently. One was this Seadragon technology, which came from a company that I acquired back in January. What Seadragon does is, it's basically a client-server technology that allows for very efficient streaming of high-resolution, big, big chunks of data, but in a very efficient manner. That's one piece.

The other piece was a breakthrough piece of machine vision research that was done in collaboration between University of Washington and MSR. And the researchers there: one was a graduate student, by the name of Noah Snavely; his professor was Steve Seitz, and our researcher within MSR is (I'm blanking on people's names now) Rick Szeliski. Thank you. Anyhow, when we saw what that research team had done in terms of this project that they called 'Photo Tourism', it was doing some very beautiful things in terms of stitching photos together automatically and spatially relating them to one another.

What really got us exciting was -- excited was, when we thought about combining that to the Seadragon technology in combination, it came together as sort of a complete solution for... "How would you make this as a web service, where you had gigapixels of data that was remote on the server, but you want to give a very fluid environment?"

So, I'm actually, I'm going to start off in a home position right now.

Robert Scoble - The ScobleShow
Let me just set down my tripod, so, I can, uh...

Gary Flake - Microsoft
Yeah... keep it steady.

Robert Scoble - The ScobleShow
Get a nice steady image for you. Yeah. Actually, hold on just a second.

I'm ready now.

Gary Flake - Microsoft
So, what we're going to do is we're going to fly around first, just to kind of give you an overall perspective. And, what you see here is a point cloud that is constructed by taking a look at all of the photos and finding out what points that they have in common. So, that's where the point cloud comes from. I'll do that one more time, just so that you can get the bird's eye view, I'll pause it. You could see that it constructs a rough approximation of a 3D model that stays. Now, we can continue flying down and drop ourselves into the middle here and basically navigate this space. Now, I'm just going to pick an arbitrary spot here and start walking around the square a little bit. And in fact, I can...

Robert Scoble - The ScobleShow
Now, wait a second. This is taken from one image!?

Gary Flake - Microsoft
No, no, no; I'm sorry. These are hundreds of photos that were taken by a photographer...

Robert Scoble - The ScobleShow
Alright.

Gary Flake - Microsoft
...and, what we've done is, using that back-end technology that was built by University of Washington and Microsoft Research, we figured out how all those different photos spatially relate to one another to effectively create a 3D model of the world, and then, how those photos actually relate to that 3D model. So, you're looking at the point cloud, which represents the 3D model, and then now you're looking at an individual photo, and if I want to...

Robert Scoble - The ScobleShow
And, how many photos... if I wanted to go out to the Golden Gate Bridge and do one of these or something like that, how many photos do you need to really do this well?

Gary Flake - Microsoft
Well, right now, I... it depends on how big the space is. Right now we're looking at a rather big space and we've probably got on the order of about 200 photos right here, I am guessing. And so, the larger the space, the more photos that you need; however, you know, dozens to hundreds, to thousands of photos actually do nice things, and so...

Robert Scoble - The ScobleShow
And how do these get in? Did somebody upload them with your tool?

Gary Flake - Microsoft
No, so it's actually lighter weight than that. So, one of our program managers went on vacation. He took a whole bunch of photos. We dropped that into our servers to allow the algorithms to crunch on them, and that is the step where it figures out how they spatially relate to one another. The point cloud is one of the outputs of that process. How the photos physically relate to that point cloud is another output. And it's from that, that's all the information we need, in order to reconstruct this 3D environment.

Robert Scoble - The ScobleShow
So, how long did it take to put 200 photos in, like this guy did? What do you guys think? You guys have estimates?

Gary Flake - Microsoft
It's a long time, I don't know whether I want to go into the details. It's about eight to ten hours of CPU time.

Robert Scoble - The ScobleShow
Eight to ten hours. Okay.

Gary Flake - Microsoft
Yeah. So, it's currently... it can be an expensive operation. We know that there are lots of ways of actually speeding it up and doing much faster things with it.
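(Note: As a rough illustration of why this step can be expensive, the sketch below counts how many photo pairs there are if every photo is compared against every other one. The interview does not say how the matching is organized, so all-pairs comparison is an assumption made here for illustration, and `pair_count` is a hypothetical helper.)

```python
def pair_count(num_photos: int) -> int:
    """Number of unordered photo pairs among num_photos photos (assumes all-pairs matching)."""
    return num_photos * (num_photos - 1) // 2


# The pair count grows roughly with the square of the collection size.
for n in (200, 400, 2000):
    print(f"{n} photos -> {pair_count(n)} pairs")
```

Under that assumption, a 200-photo collection already involves 19,900 pairs, and doubling the collection roughly quadruples the work.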

So, I'm just going to, kind of, walk around a little bit here, and we can take a look, and for all these things -- remember I mentioned the Seadragon technology before which is about really efficiently transmitting big objects over tight pipes, well, you are seeing it right here because we can zoom in on and get a whole lot of detail, which is pretty fun.

We can continue to step back, and step back, and step back, and, you know, so this gives a real immersive experience with respect to exploring a space. And, just maintaining that visual continuity, um, has a really nice effect of... of... of helping the user to maintain context where they are. The context switches are no longer hard; they're actually quite soft. So, take a look at this, I'm just going to tour around here, walk around, I can go back, I can occasionally step back, do a whole bunch of things like that.

We call this thing right here splatter mode, where, basically, when we go into splatter mode, it takes the current image that you're on in the 3D setting, puts it into the center, and ordered around it, spiraling out from the center, are photos that are visually related to that one. So, you can see we have a whole bunch of photos here, that sort of capture part of that square and we can pick a different one and see a different perspective. For any of these photos, we can do that nice Seadragon-ized thing that I showed you before... we're basically - now we're zooming in, and getting a lot of data all at once. Again, these are like 8-megapixel images. Now we can go in, we can go out...

Robert Scoble - The ScobleShow
And are these coming over the wire live or are these locally stored?

Gary Flake - Microsoft
For the purpose of this kiosk, we set this up so it's a local store, because we're sharing bandwidth with a lot of other people, but the experience that you just saw, uh, can be had by anyone using a broadband connection. So, I'm not showing you anything that isn't available today just by having a broadband connection.
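(Note: The interview does not describe how Seadragon works internally. The sketch below illustrates one common way to stream a large image to a small screen: a multi-resolution pyramid cut into fixed-size tiles, where the viewer fetches only the tiles that overlap the screen at the current zoom level. The tile size, the example image dimensions, and the names `TILE`, `level_sizes` and `max_tiles_in_view` are all assumptions for illustration.)

```python
import math

TILE = 256  # assumed tile edge length in pixels


def level_sizes(width: int, height: int) -> list[tuple[int, int]]:
    """Sizes of each pyramid level, full resolution first, halving down to 1x1."""
    sizes = [(width, height)]
    while width > 1 or height > 1:
        width = max(1, math.ceil(width / 2))
        height = max(1, math.ceil(height / 2))
        sizes.append((width, height))
    return sizes


def max_tiles_in_view(view_w: int, view_h: int, tile: int = TILE) -> int:
    """Upper bound on tiles overlapping a viewport that is not aligned to the tile grid."""
    return (math.ceil(view_w / tile) + 1) * (math.ceil(view_h / tile) + 1)


# Hypothetical 3264 x 2448 photo (about 8 megapixels) viewed in a 1024 x 768 window.
levels = level_sizes(3264, 2448)
tiles = max_tiles_in_view(1024, 768)
print(len(levels), "pyramid levels")
print(tiles, "tiles at most per view, i.e.", tiles * TILE * TILE, "pixels")
```

In a scheme like this, the number of pixels fetched for any single view depends on the window size rather than on the size of the source image, which is one way a very large photo could stay responsive over a modest connection.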

Robert Scoble - The ScobleShow
Wow, and I can download this now, right?

Gary Flake - Microsoft
You can download this now, so if you go to labs.live.com/photosynth, and that's where the ActiveX download is. (Note: This site was discontinued when Photosynth was released from Live Labs in 2008. Please see photosynth.net today.)

Robert Scoble - The ScobleShow
Can I upload my own photos to it?

Gary Flake - Microsoft
We're not there yet and the...

Robert Scoble - The ScobleShow
'Cause, it takes eight hours of processor time...

Gary Flake - Microsoft
Yeah, what we're working on right now is a number of things.

One is a longer-term story, in terms of how we want this to overlap with lots of different products, and what's the true potential of this here because, you know, whenever people look at this, automatically, there is a whole bunch of use cases that they come up with. I see this, and I want to take my vacation photos, and I want to see how they relate to everyone's vacation photos. Someone else looks at this and they say, "Oh my gosh, it's a new way of selling real estate." Someone else says, "I want to take the photos of my child, as they've grown up, and watch them grow up in my house."

So, there are so many different ways of how this could evolve, that instead of taking the shortest path, to like the quickest product, we really want to take a slightly more thoughtful path to a better product.

Robert Scoble - The ScobleShow
Okay.

Gary Flake - Microsoft
So, that's our approach; so clearly, what's in the cards is, we had to figure out how do we make the community of photos available for many people, and I think that's when a lot of the real mind-boggling potential of this will come through. So, I'm also, what I'm going to do right now is, I'm going to show you a different collection.

Robert Scoble - The ScobleShow
Okay.

Gary Flake - Microsoft
So, right now, we're diving into the artist studio of a guy by the name of Gary Faigin, and he's an artist local to Seattle, and he was kind enough to let us just come in to his studio and watch him work. And so, we can take a look at his studio, walk around, see what's going on. There are all sorts of things going on, so I'm just kind of taking a gander of what's going on in the studio, and we've come full circle now.

But, now I'm going to turn back and notice, that as I move the mouse over, this indicates that some of these photos we actually have, at much higher resolution; that they are in fact registered in the same process. So, I can take a look at this, step back or go back in, and then zoom in and get just an amazing amount of rich detail.

Robert Scoble - The ScobleShow
Wow!

Gary Flake - Microsoft
And so, again, wait... someone, tell me how big are these... are these scans of these photos?

Jonathan Dughi - Microsoft
80 megapixels.

Gary Flake - Microsoft
These are...

Robert Scoble - The ScobleShow
80 megapixels!

Gary Flake - Microsoft
These are... Yeah! You're looking at 80...

Robert Scoble - The ScobleShow
Now, who are these guys behind you?

Gary Flake - Microsoft
Uh, these guys are member... (Why don't you introduce yourselves?) ...members of Live Labs.

Robert Scoble - The ScobleShow
Well, they can't hear because... You've got to introduce them because you've got the mic.

Gary Flake - Microsoft
Aww, shit! We have Adam Sheppard here; he is a group program manager that heads up a lot of product planning and program management within Live Labs. We've Jonathan Dughi who is a program manager working specifically on Photosynth. We have Blaise Agüera y Arcas who is an architect within Live Labs.

Robert Scoble - The ScobleShow
Excellent.

Gary Flake - Microsoft
Thank you.

Robert Scoble - The ScobleShow
Thanks. Awesome stuff, guys! This rocks!

Gary Flake - Microsoft
Yeah, so you're looking now at an 80-megapixel image, streamed over. We can now step back and see it in its context again. I'm going to zoom back in just for a second, and we can continue to walk around, and occasionally we might say, oh, here's another, you know, photo over here that I want to dive into, and again, 80... 80 megapixels.

And so, now, you're getting the idea of that. Okay, we can do this for an artist studio. Imagine, if we did this for a major museum, what sort of experience could you have walking through the museum, seeing the 100 megapixel version of something that you could only really get the... a really suitable experience, if it was done in person, but now, I can look at 80 megapixel objects over my DSL line.

Robert Scoble - The ScobleShow
Right.

Gary Flake - Microsoft
That sort of thing. And again, there's just really, really rich detail throughout this whole collection, and a sense of spatial continuity, um, that's preserved. I'm going to go back to... to where Gary's at... Gary Faigin, and see if I can navigate a little bit closer, and see what's going on with him. And here, we see a little bit of what's going on as he's working, and this is really neat, because you start to see things that look a little bit like time-lapse photography, because he's sitting there working and painting in his artwork, and each time we go through a transition, again, we're preserving the spatial continuity of all the different photos.

So for fun, if you're interested, I'm going to do a couple of more collections. This one is St. Peter's Vatican, and this collection is really nice, because from this collection we can get a sense of depth, that doesn't really exist in the other collections. So, we started from afar, we work our way in, we're going further, further, and so now you're seeing that you can capture orders of magnitude of depth within one experience, and navigate through that experience in a fairly seamless way that's incredibly compelling in terms of, how you can connect the whole thing together.

So, now I'm going to do something that's really unusual here, and we're taking a little bit of a 3D, I'm sorry, a 360 degree tour of this. I'm now going to fly above, see what's going on here. I can take a look at what's on these different pieces.

I'm going to keep moving, I'm looking for something special here, hold on. Okay, I want to get to the other side. Alright. Alright.

So, we have a little feature here that we call frusta, where it allows you to see where the photographers were standing when they took the photo, and what... what photo they were taking of. So you see here, I highlight this, and I'm seeing where their camera was pointed, and what they're taking the photo of. Well notice up over here, and over here, there's some people that... some shots that were taken from up high. So, we've now navigated this high... this space, like taking a bird's eye view stepping back, seeing that there's a photographer up there, and now I'm going to click on this, and get that view of above. Now, I'm going to turn off the frusta by clicking on this camera icon, and now we can just see, you know, some of these other images that are, you know, from... from the higher level view.
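(Note: To make the idea of a camera frustum concrete, here is a minimal top-down sketch that computes the wedge a camera covers on the ground from its position, heading and horizontal field of view. It is a simplified 2D illustration and not how Photosynth draws its frusta; `view_wedge` and all of its parameters are hypothetical.)

```python
import math


def view_wedge(cam_x, cam_y, heading_deg, fov_deg, depth):
    """Return the camera position plus the two far corners of its top-down view wedge.

    heading_deg is measured counterclockwise from the +x axis; depth is the
    distance from the camera to the far edge of the wedge along the heading.
    """
    half = math.radians(fov_deg) / 2
    heading = math.radians(heading_deg)
    # The far corners sit further away than `depth` so that the far edge
    # itself lies exactly `depth` in front of the camera.
    reach = depth / math.cos(half)
    left = (cam_x + reach * math.cos(heading + half),
            cam_y + reach * math.sin(heading + half))
    right = (cam_x + reach * math.cos(heading - half),
             cam_y + reach * math.sin(heading - half))
    return [(cam_x, cam_y), left, right]


# A camera at the origin looking along +x with a 90 degree field of view.
for point in view_wedge(0.0, 0.0, 0.0, 90.0, 10.0):
    print(point)
```

Drawing one such wedge per photo on a map of the scene gives a picture of where each photographer stood and which way the camera was pointed.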

So one more collection I want to show, and I think this will kind of do it all justice.

Robert Scoble - The ScobleShow
Now, can I get to all these collections online when I download this?

Gary Flake - Microsoft
We've put four of them up initially, so, um, this one is not available, but I believe the other ones that I've shown are.

Robert Scoble - The ScobleShow
Okay.

Gary Flake - Microsoft
So this is Grassi Lake, and what's really... actually let me... I want to get to a particular spot here, so I'm looking, I want to see these climbers in action. I'm going to go ahead and pick this one, and then I'm going to go into 3D view, which I'm in right now. Now what's really interesting is, again you have this sense of time lapse photography, because you could see the progress of the climbers.

Now, even though the camera has moved a little bit, we're preserving the spatial continuity in terms of what the climbers were going through, and now we can actually take a longer, larger view of what it was they were climbing, and how it spatially relates to the rest of the world. So again we're given entirely new experiences to what's possible with respect to photos, and being able to see that context more seamlessly and more continuously.

So you've been suspiciously silent, what does that mean, is that bad or good?

Robert Scoble - The ScobleShow
(Laughs) Stunned!

Gary Flake - Microsoft
Oh good, I'm glad to hear that, and again...

Robert Scoble - The ScobleShow
(Laughing) You stunned me into silence!

Gary Flake - Microsoft
That's great, that's good, I've never known you to be silent, so this is... I consider this, I don't know, an accomplishment.

But, anyhow, you see what's going on here, there's just... again, lots of rich information, lots of fun. We think that there's a lot of potential here, we obviously haven't thought through all the different use cases, and the reason why we put this out seven months after starting it was because we wanted to actually start something -- the dialogue between our team and the rest of the world to figure out what were the more compelling things to do. So... There you have it!

Robert Scoble - The ScobleShow
Very cool.

Gary Flake - Microsoft
Thank you.

Robert Scoble - The ScobleShow
Very cool guys. This rocks.

Gary Flake - Microsoft
Thanks.
