In this post we will talk about certain User Interface (UI) technological advances that we are observing at the moment. One such development was revealed in a recent media event conducted by Microsoft, where they announced the Microsoft HoloLens, a computing platform which achieves seamless connection between the digital and the physical world, quite similar to the experience referred to in certain movies in the past.
It is interesting to note that the design of the HoloLens device looks so similar to something we have seen before.
Even the vision of holographic computing and users interacting with such interfaces isn’t a new one. The 2002 movie “The First $20 Million Is Always the Hardest” was possibly the first time we saw what such a futuristic technology might look like.
How did we reach here? A brief discussion on UIs…
User interfaces have always been an important aspect of computers. In their early days, computers had a monochromatic screen (or at most a duo-chromatic screen). A user would type commands at the screen and the computer would execute them. Since the commands were entered as a single line or a series of lines, this interface was called the Command-Line Interface (CLI).
Command Line based UI
Such an interface was not particularly intuitive, as you had to know the list of commands that would fulfill a certain task. A certain group of individuals, i.e. geeks and some computer programmers like me, still prefer such an interface owing to its clean and distraction-free nature. However, owing to the learning curve of CLIs, researchers at the Stanford Research Institute and the Xerox PARC research center invented a new user interface called the Graphical User Interface (GUI). There were a few variations of the GUI, for example the point-and-click type, also known as the WIMP (windows, icons, menus, pointer) UI, created at the Xerox PARC research center and made popular by Apple through its Macintosh operating systems.
Apple’s Macintosh UI
The WIMP UI was also adopted by Microsoft in its Windows operating systems.
Microsoft’s Windows UI
Some early versions even included a textual user interface, with programs whose menus could be navigated using a keyboard instead of a mouse.
Early textual menu based UI
Eventually new avenues were created for UI research, continuing onwards from textual interfaces to WIMP interfaces to the World Wide Web, where objects on the web became entities accessible through a Uniform Resource Identifier (URI). Such an entity could also have semantics associated with it (as envisioned by the Semantic Web). However, with the advent of smartphones we saw a completely different class of user interfaces: touch-based user interfaces and their more evolved cousins, multi-touch systems, which allowed gesture-based interactions.
Touch and gesture based UI
This was the first time in computing history that humans were able to directly interact with an object on their device with their hands instead of using an input device. The experience was immersive, yet these objects had not entered the real world. We were on the precipice of a revolution in computing.
This revolution was the mainstream launch of wearable technology, virtual/augmented reality and optical head-mounted display devices, with the creation of devices like the Oculus Rift, Google Glass and EyeTap, among others. These devices allowed voice input and created a virtual or an augmented reality world for their users. Microsoft too was working on gesture-based interactions with the Kinect device and on research in the Natural User Interface (NUI) field. A couple of interesting works from this revolution that are worth a look are listed below.
This talk by John Underkoffler demos a UI that we saw in the movie Minority Report. He talks about the spatial aspect of how humans interact with their world and how computers might be able to help us better if we could do the same with our computers.
Here Pranav Mistry, currently the Head of the Think Tank Team and Director of Research of Samsung Research America, speaks of SixthSense, a new paradigm in computing that allows interaction between the real world and the digital world. All these works were knocking on the door of the kind of computer we saw in the 2002 movie mentioned earlier: a real-life holographic computer. Enter Microsoft HoloLens!
What is Microsoft HoloLens?
Microsoft HoloLens is an augmented reality computing platform. According to the review from Forbes.com, this device takes a step beyond current work by adding virtual holograms to the world around its user, rather than putting the user in a completely virtual environment. The device has launched a new platform for software development, i.e. holographic apps. It has also created scope for hardware research and development, as it requires new components like the Holographic Processing Unit (HPU). Visualization and sharing of ideas and interaction with the real world can now be done as envisioned in the TED talk by Pranav Mistry. A more natural way of interacting with digital content, as envisioned in the works above, is a reality now. The device tracks its user’s movements in an environment. It detects what a person is looking at and transforms the visual field by overlaying 3D objects on top of it.
What kind of applications can we expect to be developed for HoloLens?
When the touch UI became a reality, developers had to change the way they worked on software. Direct object interactions as shown above had to be programmed into their applications. Apps for HoloLens would similarly need to handle use cases of interactions involving voice commands and gesture recognition. The common ideas that come to mind, and their corresponding research implications, include:
Looking up a grocery list when you enter the grocery store (context aware)
HoloLens Environment overlaid with lists
Recording important events automatically (context aware computing)
Recognizing people at a party (social media and privacy)
Taking down notes, writing emails using voice commands (natural language understanding)
Searching for “stuff” around us (NLP, data analytics, semantic web, context aware computing)
Playing 3D games (animation and graphics)
HoloLens Environment overlaid with 3D Games
Making sure your battery doesn’t run out (systems, hardware)
Virtual work environments (systems)
Virtual Work Environments through HoloLens
Teaching virtual classrooms (systems)
Why or how could it fail?
Are there any obvious pitfalls that we are not thinking about? We can rest assured that researchers are already looking at ways this venture can fail, and for Microsoft’s own good we can be certain they have a list of ways this might go wrong and are working on fixing any flaws. However, as researchers in the mobile field with a bit of experience with the Google Glass, we can try to list some of the possible pitfalls of an AR/VR device. The HoloLens, being an untethered Augmented Virtual Reality (AVR) device, could possibly suffer from some of these pitfalls too. The reader should understand that we are not claiming any of the following to be scientifically proven; these are merely anecdotal observations from our own use.
The first thing that worried us while using the Google Glass was that it would sometimes cause us headaches after a couple of hours of use. We have not researched the effects of the device on anyone else, so this is an observation from our own experience. One concern, therefore, is the health impact of prolonged usage of an AVR device.
The second thing we noticed with the Google Glass was how fast the device heated up. We know from experience that computers get hot, for example when we play a game or run a lot of complex computations. An AVR device that is used for playing games will most probably get hot too; the Google Glass certainly did after recording a video. Here we are concerned about heat dissipation and its health impact on the user.
The third observation we made was that the Google Glass showed significant sluggishness when it tried to accomplish computation-heavy tasks. Will the HoloLens device be able to keep up with all the computations needed for, say, playing a 3D game?
The fourth concern is regarding battery capacity. The HoloLens is advertised as a device with no wires, cords or tethers. Anyone who has ever used a smartphone knows the problem of the battery running out within a day or even half a day. Will the HoloLens be able to hold a charge for long, or will it require constant charging?
The fifth concern that we had was regarding privacy. The Google Glass has faced quite a few privacy concerns because it can readily take pictures using a simple voice command or even a non-verbal command like a ‘wink’. We have worked on this issue as part of our research product FaceBlock. Will the HoloLens raise similar concerns, given that this device too has front-facing cameras that capture a user’s environment and project an augmented virtual world to the user?
The above lists of possible issues and probable application areas are not exhaustive in any way. There will be numerous other scenarios and ways we can work on this new computing platform. There will probably be a multitude of issues with such a new and revolutionary platform. However, the hybrid of augmented and virtual reality has only just started taking small steps. With the invention of devices like the Microsoft HoloLens, Google Glass, Oculus Rift, EyeTap etc., we can look forward to an exciting period in the future of computing for Augmented Virtual Reality.
Roberto Yus, Primal Pappachan, Prajit Das, Tim Finin, Anupam Joshi, and Eduardo Mena, Semantics for Privacy and Shared Context, Workshop on Society, Privacy and the Semantic Web-Policy and Technology, held at Int. Semantic Web Conf., Oct. 2014.
Capturing, maintaining, and using context information helps mobile applications provide better services and generates data useful in specifying information sharing policies. Obtaining the full benefit of context information requires a rich and expressive representation that is grounded in shared semantic models. We summarize some of our past work on representing and using context models and briefly describe Triveni, a system for cross-device context discovery and enrichment. Triveni represents context in RDF and OWL and reasons over context models to infer additional information and detect and resolve ambiguities and inconsistencies. A unique feature, its ability to create and manage “contextual groups” of users in an environment, enables their members to share context information using wireless ad-hoc networks. Thus, it enriches the information about a user’s context by creating mobile ad hoc knowledge networks.
Community Health Workers (CHWs) act as liaisons between health-care providers and patients in underserved or un-served areas. However, the lack of information sharing and training support impedes the effectiveness of CHWs and their ability to correctly diagnose patients. In this paper, we propose and describe a system for mobile and wearable computing devices called Rafiki which assists CHWs in decision making and facilitates collaboration among them. Rafiki can infer possible diseases and treatments by representing the diseases, their symptoms, and patient context in OWL ontologies and by reasoning over this model. The use of semantic representation of data makes it easier to share knowledge related to disease, symptom, diagnosis guidelines, and patient demography, between various personnel involved in health-care (e.g., CHWs, patients, health-care providers). We describe the Rafiki system with the help of a motivating community health-care scenario and present an Android prototype for smart phones and Google Glass.
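To make the idea of inferring possible diseases from represented symptoms more concrete, here is a minimal sketch. It is not Rafiki's implementation: the abstract describes OWL ontologies and a reasoner, whereas this sketch swaps in plain Python sets and a simple overlap score. The function name, the scoring rule, and the placeholder disease and symptom names are all hypothetical.

```python
def candidate_diseases(observed_symptoms, knowledge_base):
    """Rank diseases by the fraction of their known symptoms that were observed.

    knowledge_base maps a disease name to the set of symptoms associated
    with it. This stands in for the OWL model described in the abstract;
    it is an assumption made for illustration, not Rafiki's actual reasoning.
    """
    observed = set(observed_symptoms)
    scored = []
    for disease, symptoms in knowledge_base.items():
        matched = observed & symptoms
        if matched:
            # Score is the share of the disease's symptoms that were seen.
            scored.append((disease, len(matched) / len(symptoms)))
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Placeholder knowledge base with hypothetical names (not medical data).
EXAMPLE_KB = {
    "disease_a": {"symptom_1", "symptom_2"},
    "disease_b": {"symptom_2", "symptom_3", "symptom_4"},
    "disease_c": {"symptom_5"},
}
```

Calling `candidate_diseases(["symptom_1", "symptom_2"], EXAMPLE_KB)` ranks `disease_a` ahead of `disease_b` and leaves out `disease_c`. A real ontology-based system could additionally use patient context and class hierarchies, which this sketch ignores.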
If you are a Google Glass user, you might have been greeted with concerned looks or raised eyebrows in public places. There has been a lot of chatter on the “interweb” regarding the loss of privacy that results from people taking your picture with Glass without notice. Google Glass has simplified photography but, as often happens with revolutionary technology, people are worried about its potential misuse.
FaceBlock helps to protect the privacy of people around you by allowing them to specify whether or not to be included in your pictures. This new application, developed jointly by researchers from the Ebiquity Research Group at University of Maryland, Baltimore County and Distributed Information Systems (DIS) at University of Zaragoza (Spain), selectively obscures the faces of people in pictures taken by Google Glass.
Comfort at the cost of Privacy?
As the saying goes, “The best camera is the one that’s with you”. Google Glass suits this description as it is always available and can take a picture with a simple voice command (“Okay Glass, take a picture”). This allows users to capture spontaneous life moments effortlessly. On the flip side, this raises significant privacy concerns as pictures can be taken without one’s consent. If one does not use this device responsibly, one risks being labelled a “Glasshole”. Quite recently, a Google Glass user was assaulted by patrons who objected to her wearing the device inside a bar. The list of establishments that have banned Google Glass within their premises is growing day by day. The dos and don’ts for Glass users released by Google are a good first step, but they don’t solve the problem of privacy violation.
Privacy-Aware pictures to the rescue
FaceBlock takes regular pictures taken by your smartphone or Google Glass as input and converts them into privacy-aware pictures. This output is generated by using a combination of Face Detection and Face Recognition algorithms. By using FaceBlock, a user can take a picture of herself and specify her policy/rule regarding pictures taken by others (in this case ‘obscure my face in pictures from strangers’). The application would automatically generate a face identifier for this picture. The identifier is a mathematical representation of the image. To learn more about how FaceBlock works, you should watch the following video.
Using Bluetooth, FaceBlock can automatically detect and share this policy with Glass users nearby. After receiving this face identifier from a nearby user, the following post-processing steps happen on Glass, as shown in the images.
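The matching step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not FaceBlock's actual code: we assume a face identifier is a short numeric vector, that two identifiers match when their Euclidean distance falls under a threshold, and that face detection has already produced one vector per face in the picture. The names `SharedPolicy`, `faces_to_obscure` and the threshold value are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class SharedPolicy:
    """A policy received from a nearby user (hypothetical structure)."""
    owner: str
    face_id: Sequence[float]   # assumed numeric form of the face identifier
    obscure: bool              # True means "obscure my face in your pictures"


def distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two identifiers of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def faces_to_obscure(detected_faces: List[Sequence[float]],
                     policies: List[SharedPolicy],
                     threshold: float = 0.5) -> List[int]:
    """Return indices of detected faces that match an 'obscure' policy."""
    to_blur = []
    for index, face in enumerate(detected_faces):
        for policy in policies:
            # A face is obscured only if its owner asked for it and the
            # identifiers are close enough to count as the same person.
            if policy.obscure and distance(face, policy.face_id) <= threshold:
                to_blur.append(index)
                break
    return to_blur
```

For example, given one policy that asks for obscuring and a picture with three detected faces, the function returns the index of the single face whose identifier is close to that policy's identifier. The blurring itself would then be applied to that region of the image.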
What promises does it hold?
FaceBlock is a proof-of-concept implementation of a system that can create privacy-aware pictures using smart devices. The pervasiveness of privacy-aware pictures could be a step in the right direction towards balancing privacy needs and the comfort afforded by technology. Thus, we can get the best out of wearable technology without being oblivious to the privacy of those around us.
FaceBlock is part of the efforts of Ebiquity and SID in building systems for preserving user privacy on mobile devices. For more details, visit http://face-block.me
Memoto is a $279 lifelogging camera that takes a geotagged photo every 30 seconds, holds 6K photos, and runs for several days without recharging. The company producing Memoto is a Swedish company initially funded via Kickstarter and expects to start shipping the wearable camera in April 2013. The company will also offer “safe and secure infinite photo storage at a flat monthly fee, which will always be a lot more affordable than hard drives.”
The lifelogging idea has been around for many years but has yet to become popular. One reason is privacy concerns. DARPA’s IPTO office, for example, started a LifeLog program in 2004 which was almost immediately canceled after criticism from civil libertarians concerning the privacy implications of the system.
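As a rough sanity check on those numbers, the short calculation below estimates how long the on-board storage would last. It assumes “6K” means 6,000 photos and that the camera shoots continuously at the stated interval, neither of which the announcement spells out; the function names are ours.

```python
PHOTO_INTERVAL_SECONDS = 30        # one photo every 30 seconds (from the post)
STORAGE_CAPACITY_PHOTOS = 6000     # "6K photos", assumed to mean 6,000


def hours_until_full(capacity_photos: int, interval_seconds: int) -> float:
    """Hours of continuous capture before the storage is full."""
    return capacity_photos * interval_seconds / 3600


def photos_per_day(interval_seconds: int) -> float:
    """Photos taken in 24 hours of continuous capture."""
    return 24 * 3600 / interval_seconds


print(hours_until_full(STORAGE_CAPACITY_PHOTOS, PHOTO_INTERVAL_SECONDS))  # 50.0
print(photos_per_day(PHOTO_INTERVAL_SECONDS))                             # 2880.0
```

Under these assumptions the storage fills after about 50 hours of continuous shooting, roughly two days' worth of photos.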
UMBC CSEE department members submitted a number of #ifihadglass posts hoping to get an invitation to pre-order a Google Glass device. Several came from the UMBC Ebiquity Lab including this one that builds on our work with context-aware mobile phones.
Reports are that as many as 8,000 of the submitted ideas will be invited to the first round of pre-orders. To get a rough idea of our odds, I tried using Google and Bing searches to estimate the number of submissions. A general search for pages with the #ifihadglass tag returned 249K hits on Google. Of these, 21K were from Twitter and fewer than 4K from Google+. I’m not sure which of the Twitter and Google+ posts get indexed and how long it takes, but I do know that our entry above did not show up in the results. Bing reported 171K results for a search on the hashtag, but our post was not among them. I tried the native search services on both Twitter and Google+, but these are oriented toward delivering a stream of new results and neither gives an estimate of the total number of results. I suppose one could do this for Twitter using their custom search API, but even then I am not sure how accurately one could estimate the total number of matching tweets.
Can anyone suggest how to easily estimate the number of #ifihadglass posts on Twitter and Google+?
UK semantic technology company True Knowledge has released Evi, a mobile app that competes with Siri.
The mobile app is available on the Android Market and on iTunes. You can pose queries to either version by speaking or typing. The Android app uses Google’s ASR speech technology and the iTunes app uses Nuance.
True Knowledge has been developing a natural language question answering system since 2007. You can query True Knowledge online via a Web interface. Try the following links for some examples:
Yesterday I made a purchase at the CVS store on Edmondson Avenue in Catonsville using Google Wallet on a Nexus S 4G phone with NFC.
NFC is near field communication, an RFID technology that allows communication and data exchange between two devices in close proximity, e.g., within a few inches.
Several current smartphones have NFC chips, including Samsung's Google-branded Nexus S 4G, and more are expected to include it in the coming months and years.
The first, and perhaps most significant, use of NFC will be enabling mobile phones to serve as "virtual credit cards", especially for small amounts that don't require a signature. The range of potential applications is much greater and will no doubt evolve as mobile NFC-enabled devices become ubiquitous.
Buying something at the CVS (OK, … it was candy) this way was fun. My phone made satisfying noises as it talked to CVS's payment station and the clerk, who had not had anyone use an NFC device before, was properly mystified. Using it was marginally easier than swiping a credit card, but maybe even a small amount of increased convenience is worth it for such an everyday transaction.
One limitation of Google Wallet is that it currently only works with Sprint on a Nexus S 4G and with either a Citi® MasterCard® card or a Google Prepaid Card. You can load money into the latter with most any credit card and Google will get you started by adding $10 to it as an incentive.
By the way, for what it’s worth, I only recently realized that the robots in Philip K. Dick’s novel “Do Androids Dream of Electric Sheep?” were called androids and the dangerously independent new model was the Nexus-6, designed by the Rosen Association (which became the Tyrell Corporation in the film adaptation, Blade Runner).
Pervasive, context-aware computing technologies can significantly enhance and improve the coming generation of devices and applications for consumer electronics as well as devices for work places, schools and hospitals. Context-aware cognitive support requires activity and context information to be captured, reasoned with and shared across devices — efficiently, securely, adhering to privacy policies, and with multidevice interoperability.
The AAAI-11 conference will host a two-day workshop on Activity Context Representation: Techniques and Languages, focused on techniques and systems that allow mobile devices to model and recognize the activities and context of people and groups and then exploit those models to provide better services. The workshop will be held on August 7th and 8th in San Francisco as part of AAAI-11, the Twenty-Fifth Conference on Artificial Intelligence. Submissions of research papers and position statements are due by 22 April 2011.
The workshop intends to lay the groundwork for techniques to represent context within activity models using a synthesis of HCI/CSCW and AI approaches, in order to reduce demands on people, such as the cognitive load inherent in activity/context switching, and to enhance human and device performance. It will explore activity and context modeling issues of capture, representation, standardization and interoperability for creating context-aware and activity-based assistive cognition tools, with topics including, but not limited to, the following:
Activity modeling, representation, detection
Context representation within activities
Semantic activity reasoning, search
Security and privacy
Information integration from multiple sources, ontologies
There are three intended end results of the workshop: (1) Develop two or three key themes for research with specific opportunities for collaborative work. (2) Create a core research group forming an international academic and industrial consortium to significantly augment existing standards/drafts/proposals and create fresh initiatives to enable capture, transfer, and recall of activity context across multiple devices and platforms used by people individually and collectively. (3) Review and revise an initial draft of the structure of an activity context exchange language (ACEL), including identification of use cases, domain-specific instantiations needed, and drafts of initial reasoning schemes and algorithms.
An issue on Reasoning with context in the Semantic Web seeks papers by June 15, 2011 and will be published in the Spring of 2012. The special issue will be edited by Alan Bundy and Jos Lehmann of the University of Edinburgh and Ivan Varzinczak of the Meraka Institute.
An issue on The Semantic Web in a Mobile World will accept submissions until October 1, 2011 and will be published in September 2012. The special issue will be edited by Ansgar Scherp of the University of Koblenz-Landau and Anupam Joshi of the University of Maryland, Baltimore County.
The next iPhone is rumored to have something similar.
Support for NFC in popular smartphones could unleash lots of interesting applications, many of which have already been explored in research prototypes in labs around the world. One interesting possibility is that this could be used to allow Android devices to share RDF queries and data with other devices.
“Nokia plans to add antennas and RFID communications chips into its phones soon, and Apple has been patenting the heck out of the idea, but both companies were probably going to rely on an in-phone antenna loop. It seems increasingly certain Apple is going to bring RFID into common usage with the iPhone for 2011 (the iPhone 5) because there’s a new patent that shows just how far Apple has gone with design thinking for RFID. The patent shows how an RFID loop, powerful enough to act as both RFID tag or a tag-reader, can actually be built right into the complex layered circuitry of the iPhone (or iPod Touch) screen. We know Apple is fond of highly-polished design and integration, and this innovation is no exception. The screen has to be exposed by its very nature, which is good for RFID purposes — the wireless signal is unobstructed by other bulk in the smartphone, and it frees up Apple to do what it likes with the rest of the phone’s design.”
Maybe building RFID into smartphones will finally unleash the potential the technology offers for cool, people-oriented applications, as opposed to boring inventory management tasks. However, I don’t like the idea of not being able to use my credit card because my phone ran out of power.