With this, I hand over to our third speaker, so I'd like you to welcome Professor Dr. Paul Lukowicz. He is Scientific Director of the German Research Center for Artificial Intelligence, or the DFKI for those who speak German. He was here last year and gave a great presentation, and I'm really looking forward to today's presentation as well.
So, please welcome Dr. Paul Lukowicz; I think he will introduce himself in more detail.
Yeah, thank you.
So, the session is about risks and opportunities. Others will talk about risks; I want to talk about an opportunity: where generative AI can change things in wearable devices.
About myself: I lead the Embedded Intelligence Department, a group of postdocs and some 15 PhD students who work on things like hardware and sensing, smart healthcare, machine learning, and social computing, around the combination of essentially AI and hardware, mobile and ubiquitous devices. We do some quantum computing as well. So it's a pretty broad area.
Now, if you look at wearable computing and wearable systems, it's actually something pretty old. It goes back to the late 90s at MIT, and some of you will remember that in the late 90s the notion of a mobile computer was something a fairly strong person was able to carry, not something like a smartphone. In that era, people were hanging all sorts of electronics and devices all over themselves and trying to use them while walking.
So it was a very, very nerdy thing. That's me when I was younger, wearing some strange jacket. It was actually a pretty cool thing.
We built it back in the year 2000: a sort of jacket with a phone built in. And the thing was really nice, except that it had raw lithium-ion batteries in it. So if you did something wrong, it would tend to catch fire. And I was actually able to fly with that thing back then. Imagine: I was on an airplane with this thing full of lithium-ion batteries that tended to catch fire, and I would pass through security.
So, different times, different risks, right? And there's another thing we were working on back then, again around 2000: a watch. It was about that big, and it would take your pulse and your blood pressure and so on. That was science then; today you can obviously just buy those things.
And, you know, the path from these nerdy things to mainstream wearable computing started with the iPhone in 2007, which is now almost 20 years ago. And although you would not call the iPhone a wearable device, it brought connectivity, a degree of intelligence, and the ability to interact with the user to you on the go. You went away from the suitcase you were carrying as a computer to something that is always online. Then, about eight years later, Apple brought out the Apple Watch.
And the interesting thing is that in the beginning people were skeptical, but now I would guess three quarters of people wearing a watch wear a smartwatch. Some people do still like those old mechanical wristwatches, but smartwatches have largely taken over.
And close after that came the earbuds. If you look at younger people, this is very much the current standard outfit, right? My children always walk around with their earbuds, and they have a smartwatch and a smartphone; this sort of wearable stuff is mainstream. This year, Samsung announced a smart ring. Let's see how far that goes.
But that's a fourth device. And there is, of course, a whole score of devices around it. You can buy shoes with sensors if you're really deeply into running. And here's something we did a couple of years ago with Telekom: a wearable computing fashion show. This lady actually has a dress with drones that could land on it, take off, and do a performance. So there's a lot of stuff you could do.
And of course, technology caught up with the things we were working on back then; you can now buy them in a watch, and they work. Now, the interesting thing is that the market potential is pretty big. This is an estimate of the market for wearable devices: in 2024, something like 80 billion dollars globally. Everybody's talking about generative AI, but if you compare the two, the wearable device market is actually larger right now.
People estimate the generative AI market to grow faster, but it's still of a similar order of magnitude, right? So while everybody's talking about AI, and there's a reason for it, if you build hardware, it's higher value and a higher market, and it's something everybody has. And the argument I'd like to make is that the real potential beyond both of those lies in the integration of the two technologies.
You bring generative AI together not just with a network it can hack in your company, but with the devices you have on your body, so that it can hack your life and hack you, and do all the unethical stuff we were talking about before, on the go, right? So why is that?
Now, look at what the tech companies are pushing as the next generation: there's a lot of talk about smart glasses with a display. Two years ago, Apple came out with the Vision Pro, and it's been very hyped. It's totally cool; we have one in our lab, of course, but nobody's really using it, and it's not selling in big numbers. Meta is also pushing all sorts of augmented reality; you can actually buy Ray-Ban glasses with sound and cameras. That's the next generation they're looking at.
Now, why are those things not selling well? And why do I think they have potential?
Well, why they're not selling well is pretty clear. I mean, look at the scene here: you're sitting on a sofa, and instead of watching something on TV, you can watch it in your glasses.
Yeah, in some situations that sort of has something to it, but it's not a big game changer, right? Not something that really makes sense. And to understand what makes sense and why, you have to look at the way computers and interaction with computers have been evolving. Go back to the old days of the MS-DOS PC: you would go online with a PC, say, once a day. Then you had notebooks, and it was something you would do a couple of times a day. If you send an email to someone now and don't get an answer within two or three hours, you start worrying, right? So that's the way it went. Then came smartphones, and there are statistics that people actually use a smartphone about 150 times a day.
Now, think about that. Not all of you; I'm also guilty, and my children certainly are. But try to think of anything else in life you do 150 times a day. The only thing I can think of is breathing, right? So that's actually something very, very strong. And if you look at smartwatches, their value is that, in a way, they give you access to the digital domain even more often. If I take out my smartphone here and check my messages, it's rude, right? If I just do something like this with my wrist, it's fine. So I can check and access information more often.
And then when things like smart glasses, like Google Glass, come along, it's something you could do continuously. This is my son looking at my Google Glass; I got a device a couple of years ago. It's a fun, geeky thing to play with, but totally useless as it is, except for my application.
And if you then look at what Meta is trying to do, it's taken even further. The vision behind it is really that this division, this question of how often you go online, just stops being a topic. You simply live permanently in a merger of digital and physical information. That is what they are pushing for with the technology.
So, something like: you walk around and everything permanently has a digital tag assigned, or you see people's posts. I look at you and I see your LinkedIn profile, your latest posts, and things like that. At first, that sounds strange. But on the positive side, look around the room: how many signs and other labels do you have here, right? And you're fine with them. So if I'm able to put similar signs in your view, not the same for everyone, but different for each person, there is something to it.
The reason the whole thing doesn't fly is that with current technology you have two possibilities. One is that the system just goes and plasters the environment with some sort of information. That's what you see with your phone: you get all sorts of advertisements. It's obtrusive; it doesn't make sense.
Look at this picture here: you don't really want to see something like this. The other possibility is that you consciously ask: I'm here, I'd like to see who you are, look at your LinkedIn profile while I talk to you. But then I would have to go and type it in somehow, even with a virtual keyboard or whatever. That's not merging: once I start interacting with the system, I cannot interact with reality. And that's why those glasses don't fly, because this notion of a merger of the two realities doesn't work.
So what you need are applications like this: you're doing something dangerous, the system sees it, and it warns you that something could happen. In Kaiserslautern, we had this case where somebody actually blew up his car because the tank was frozen and he used a lighter to unfreeze it, right?
So you'd like your system to compensate for people's stupidity. And you can do other things: I go somewhere and I don't want to be browsing.
I just say: what's interesting in this building? Why is this interesting to me? This sort of natural interaction is what you would like to have for the system to be really useful.
I call this type of system the "what now" system. Instead of browsing for something, say you're being chased by a bear: taking out a smartphone and trying to Google "what do I do when a bear is chasing me" is a suboptimal approach, right? But something like asking "what do I do now?" and getting an answer, that's what helps you. And in other situations too: I'm somewhere at a train station, where do I go now? I've been to China, and getting around a Chinese train station is a real challenge.
You'd like to just ask your system: what do I do now? And you'd like the system to help you. The problem, of course, is that what you need to make such systems possible is a very complex interpretation of the real world, which today's systems cannot provide. And that is precisely where generative AI comes into play. So let me make a short excursion into generative AI, with a slightly different view on it from the previous speakers'. It's very interesting how differently we all look at it.
So first of all, people now talk about LLMs and claim a lot of things. They say LLMs can reason, they talk about them becoming conscious and eventually replacing humans. My view is that this is all rubbish; that's not something they can do. So what actually are LLMs? Let me give you my view, and it starts with a very old idea.
Going back to the ancient Greek philosophers, there is a very beautiful allegory that says the way humans perceive the world is like prisoners sitting in a cave and seeing shadows on the wall. In other words, what we see is a very filtered and skewed reality. Now, people invented language to describe this skewed perception of the world. Then they wrote down whatever they produced and thought about the world, put it in books, and we digitized those.
Then we applied neural networks to all that and created a model of the distribution of essentially lexical tokens in this huge amount of material that people have produced. That's what LLMs are. Now, the first question you come up with is: why the heck does this do anything useful? And that is something the AI community was really surprised by. People have been working on probabilistic language models for a long time, but when ChatGPT came out, all of us were surprised that it was so powerful.
But of course, once you think about it, it's clear: the texts are reflections of people's perceptions of reality, and since the model has a huge amount of text, these systems are something like an approximate, very noisy world model based on the experience of humanity as a whole. That's why they can do all these amazing things. You ask something, and it produces something, because it has access to everything people ever put online. Some of it is nonsense, some of it is good, but that's what they have.
The other thing to remember is that they do not understand any of this, right? They access it based on a probability distribution of words. And this is the very ambivalent thing: the same system can do amazing stuff and be just plain stupid. Now, with the newer models, today we don't just post text, we post multimedia: we combine videos, images, and text. So you have these vision-language models where the models essentially relate what exists online as text to what exists online as images, and from that you get an abstract representation.
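As an aside, this "distribution of lexical tokens" view is easy to make concrete in code. Here is a minimal sketch, assuming the Hugging Face transformers library, with the small GPT-2 model as a stand-in for any causal language model; the point is that all the model fundamentally produces is a probability distribution over the next token.

```python
# Minimal sketch: an LLM as a probability distribution over the next token.
# Assumes the Hugging Face transformers library; GPT-2 is a small stand-in
# for any causal language model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The way humans perceive the world is like"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (1, seq_len, vocab_size)

# The distribution over the *next* token: this is all the model produces.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r}: p={float(prob):.3f}")
```

Everything an LLM "says" is generated by sampling from such distributions, token by token; the apparent world knowledge lies in how the training texts shaped those distributions.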
So where does that bring you? Look back at the applications we had, like this one: what can I do here? If you give a model a picture like this and ask it what an art lover could do with it, you get a really good answer. You walk around and ask: why is this interesting for me?
You can go further. You show it something like this: a person, not with a gas tank this time, but working with a lighter to unfreeze a lock. You ask it what is wrong with this picture, and you actually get a good answer, right?
And again, the reason this works is that there are a lot of memes on the internet that relate to situations like this. But the point is that it doesn't just do image recognition; it produces a quite reasonable interpretation of the image with respect to what a human would think about it.
And you can ask the system what safe methods there are to actually do it, and it gives you the information. Now, if you think about using this through wearable glasses, these things become useful: you just walk around, you see a frozen lock, you say, what do I do now? You immediately get on your display what you can do, and maybe you can even order the right tool from Amazon straight away. Then this thing is suddenly useful.
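As a rough sketch of what such a "what do I do now" query could look like, assuming the OpenAI Python SDK with a vision-capable model (the model name, the file name, and the prompt are illustrative assumptions, not the specific product behind any of these demos):

```python
# Sketch: send a camera frame plus a question to a vision-language model.
# Assumes the OpenAI Python SDK; model name, file name, and prompt are
# illustrative, not a specific product.
import base64
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def what_do_i_do_now(image_path: str, question: str = "What do I do now?") -> str:
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any vision-capable model works here
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    )
    return response.choices[0].message.content

# e.g. a frame from the glasses camera showing a frozen car lock
print(what_do_i_do_now("frozen_lock.jpg"))
```

On glasses, the only difference is where the image comes from and where the answer goes: a camera frame in, a head-up display or earbud audio out.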
And you can do other things. Something like this: what problem does this person have? And the system figures out that you're probably building a piece of IKEA furniture, and it could even use the display to answer you. So the point is that you suddenly have this ability to be in the real world, say things, gesture, show the system a picture, and the system, I don't like to say it "understands", but it has a hypothesis of what may be useful and is able to deliver that information.
And coming back now to the thing with the bear: what should the person in this picture do? Again, asking ChatGPT, you get a reasonable answer. Probably better than nothing; you shouldn't trust your life to an LLM, right? So it's not always reliable, but it's probably better than nothing. And if you look at a situation where you can just tap your earbud and ask the system, "what should I do now?", then it's actually viable in nearly every situation.
Whereas browsing your smartphone for the information while the bear is chasing you is not a good strategy, right? So it's this assistant thing that really helps a lot.
And if you build on that, you can imagine even more. You're running along, you don't even see the bear, but you have your smart glasses with a camera, and you're connected to the IMU and the pulse sensor on your smartwatch. The system can then easily translate that information into a query to the LLM, assess the situation, and tell you: there may be a bear behind you, don't panic, just slow down, and things like that.
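A hedged sketch of that fusion step: the sensor readers below are hypothetical stubs standing in for whatever APIs the smartwatch actually exposes, and the "fusion" is simply serializing the readings into the text context of the query.

```python
# Sketch: fuse wearable sensor readings into one LLM query.
# read_heart_rate() and read_motion_intensity() are hypothetical stubs
# for whatever the smartwatch APIs actually expose.

def read_heart_rate() -> int:
    return 165  # stub: bpm from the watch's pulse sensor

def read_motion_intensity() -> float:
    return 2.3  # stub: peak acceleration in g from the watch IMU

def build_situation_prompt() -> str:
    # Serialize the wearable readings into plain text the model can use
    # as context alongside the current camera frame.
    return (
        "Context from my wearables:\n"
        f"- heart rate: {read_heart_rate()} bpm\n"
        f"- motion: {read_motion_intensity():.1f} g, periodic (probably running)\n"
        "Attached: the current frame from my glasses camera.\n"
        "Question: what should I do right now?"
    )

print(build_situation_prompt())
# The prompt plus the camera frame would then go into the same kind of
# vision-language query as in the earlier sketch, e.g.:
#   what_do_i_do_now("glasses_frame.jpg", build_situation_prompt())
```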
And the point is, it's not that far-fetched. You can actually play these games with an LLM today, always remembering that it's not an expert with a real understanding of anything, but that in many cases it can retrieve stuff that is genuinely useful. And coming back to wearable technology: based on this capability, the notion of having all these devices with you suddenly gives you a totally different value.
Currently, having smart glasses or many of these other devices has limited value, because as you walk around the world you don't want to be accessing them all the time. But this notion of merging the real and the virtual world is really something that now becomes possible.
And in case you think this is science fiction, let me see if I can get the sound; actually, it doesn't need sound. Here's a demo from Google Gemini; I don't know if some of you know it. They installed an experimental version of it on a phone, and what you see is a person walking around with the phone. You can just ask it: tell me if you see something that makes sound. And the thing sees the loudspeaker and identifies it correctly, right?
Now translate that to you wearing smart glasses with a camera that does the same. Today you have to hold up a phone, which you wouldn't really do. But with glasses I just walk around, do like this with my finger, and I get the information. So you just walk through the world and get information.
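Translated into code, the interaction loop itself is conceptually tiny. A sketch under the same assumptions as before, where the gesture detector, frame grabber, and display are hypothetical stand-ins for the glasses' actual APIs:

```python
# Sketch of the glasses interaction loop: walk around, gesture, get an answer.
# All three helpers are hypothetical stubs for the glasses' actual APIs;
# what_do_i_do_now() is the vision-language query from the earlier sketch.
import time

def detect_trigger_gesture() -> bool:
    return False  # stub: e.g. a finger tap recognized by the glasses' camera

def grab_camera_frame() -> str:
    return "current_frame.jpg"  # stub: path to the latest camera frame

def show_on_display(text: str) -> None:
    print(text)  # stub: render on the head-up display or read out via earbud

while True:
    if detect_trigger_gesture():
        answer = what_do_i_do_now(grab_camera_frame(),
                                  "What am I looking at? Anything useful here?")
        show_on_display(answer)
    time.sleep(0.1)  # poll at roughly 10 Hz
```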
And again, once you translate this into some sort of wearable hardware, suddenly it's this wearable AI thing that really transforms the way you interact with reality. And there are other examples like this.
So, in summary, what I wanted to show you is that in generative AI and wearable technology you have a convergence of two technologies with the potential to create totally new applications. I actually believe that the impact of generative AI on all systems where humans or machines interact with the real world is much more profound than what you see currently. Today I use generative AI to generate most of my pictures, you use it to write text or to search your databases.
But changing the way systems interact with reality will have an even more profound impact, and it will create all sorts of security and other problems. Due to our earlier hiccup, we have a few minutes left for questions. Are there any questions?
First, this was fantastic, thank you. An odd question, but have you ever seen the TV show Voltron: Legendary Defender?
No, no. So, the short version: in the show, people are piloting spaceships, and it's sort of a running joke among the fandom that there's no problem that can't be solved by shoving the controls all the way forward and yelling. It's a little silly if you think about actually flying, but what is really happening, if you get into the lore of the show, is that it's an intentional interface. The pilot is basically saying: I really, really need to do the thing I'm thinking about right now; this is important.
So shoving the thing forward and yelling actually communicates something. What you're talking about feels to me like the first steps toward that. Do you think that's a space we're heading into, where computers can do what I mean and we just need to nudge them, or is that still a long way off? So I think in a way we are going in this direction, but it's not so much that the computer can do what you mean. It's that making the computer do something approximately like what you would like it to do is getting easier, right?
As I said, instead of having to explicitly specify everything, you can say something and leave the rest to the system's interpretation. And as with everything in life, people talk about risk, and my favorite proverb is "there's no such thing as a free lunch": using these systems can make things extremely easy, but it opens space for spectacular failures. Any further questions? Now you have the chance. So, the next way we will be communicating with our assistants: what do you think it will look like? Questions?
I really like the way you said that it's really augmenting you while you're doing something. So: what do I do now?
This "what to do now" idea, I really like it, because running around with a mobile phone and bumping into people at a train station is not the way to move forward, and it's a nice approach to really have the system understand what you're doing right now. Final chance? Okay.
I mean, when I think about this world, I of course also have to think about its security implications. What if someone were to hack into the system and present different information or force different actions? So obviously, the more you use these systems for, the more doors you open to misuse. I think the question, again, is: what is the risk? If it's my personal system that I use for everyday assistance, the most likely consequence is that I somehow make a fool of myself, or something like that. When you come to self-driving cars, for example, it's very different.
But I also think people tend to overestimate those risks. In Germany, you regularly have cases where people get killed because some stupid kids or malicious people throw rocks off a bridge over the autobahn. So you don't need AI to be really malicious. Then again, that's not quite true, because the risk with AI is that sometimes one malicious act can impact a huge number of people, and that is the problem. But that's also where you actually need to put in the safeguards.
It's not so much about the system itself; you have to assume the worst, and the worst is that these things will always be hackable. The question is: how can I make sure they cannot do too much harm by themselves, by not giving them authorized control over too many things? Great. Thank you very much, Paul.