UMBC eBiquity Blog
Tim Finin, 6:32pm 6 April 2015
In this week’s meeting, Sandeep Nair will talk about his work on ‘Preventing SQLIA and OJVMWCU, a web service utility for Oracle RDBMS‘ at 10:00am Tuesday, 7 April 2015 in ITE 346.
SQL Injection attacks have a long history dating back to 1999, but OWASP still maintains Injection attacks, which includes SQLIA, as the top rated vulnerability, due to the simplicity to perform and the high impact it can cause. SIAP is a project aimed at an automated attempt to secure ASP .NET with C# based web applications. The second tool OjvmWCU is a tool which is released with Oracle RDBMS 12.1, which allows users to call SOAP based web services using PLSQL!
Tim Finin, 12:08am 25 February 2015
Ph.D. Dissertation Proposal
User Identification in Wireless Networks
9:00-11:00pm Friday, 27 February 2015, ITE 325B
Wireless communication using the 802.11 specifications is almost ubiquitous in daily life through an increasing variety of platforms. Traditional identification and authentication mechanisms employed for wireless communication commonly mimic physically connected devices and do not account for the broadcast nature of the medium. Both stationary and mobile devices that users interact with are regularly authenticated using a passphrase, pre-shared key, or an authentication server. Current research requires unfettered access to the user’s platform or information that is not normally volunteered.
We propose a mechanism to verify and validate the identity of 802.11 device users by applying machine learning algorithms. Existing work substantiates the application of machine learning for device identification using Commercial Off-The-Shelf (COTS) hardware and algorithms. This research seeks the refinement of and investigation of features relevant to identifying users. The approach is segmented into three main areas: a data ingest platform, processing, and classification.
Initial research proved that we can properly classify target devices with high precision, recall, and ROC using a sufficiently large real-world data set and a limited set of features. The primary contribution of this work is exploring the development of user identification through data observation. A combination of identifying new features, creating an online system, and limiting user interaction is the objective. We will create a prototype system and test the effectiveness and accuracy of it’s ability to properly identify users.
Committee: Drs. Joshi (Chair/Advisor), Nicholas, Younis, Finin, Pearce, Banerjee
Tim Finin, 9:29pm 15 February 2015
ACM Tech Talk
Studying Internet Latency via TCP Queries to DNS
Dr. Yannis Labrou
Principal Data Architect, Verisign
1:30-2:30pm Friday, 27 February 2015, ITE 456, UMBC
Every day Verisign processes upwards of 100 billion authoritative DNS requests for .COM and .NET from all corners of the earth. The vast majority of these requests are via the UDP protocol. Because UDP is connectionless, it is impossible to passively estimate the latency of the UDP-based requests. A very small percentage of these requests though, are over TCP, thus providing the means to estimate the latency of specific requests and paths for a subset of the hosts that interact with Verisign’s network infrastructure.
In this work, we combine this relatively small number of datapoints from TCP (on the order of a few hundred million per day) with the much larger dataset of all DNS requests. Our focus is the process of data analysis of real world, imperfect data at very large scale with the goals of understanding network latency at an unprecedented magnitude, identifying large volume, high latency clients and improving their latency. We discuss the techniques we used for data selection and analysis and we present the results of a variety of analyses, such as deriving regional and country patterns, estimations for query latency for different countries and network locations, and techniques for identifying high latency clients.
It is important to note that latency results we will report are based on passive measurements from, essentially, the entire Internet. For this experiment we do not have control over the client side — where they are, which software, their configuration, their network congestion. This is significantly different from latency studied in any active measurement infrastructure such as Planet Lab, RIPE Atlas, Thousand Eyes, Catchpoint, etc.
Dr. Yannis Labrou is Principal Data Architect at Verisign Labs where he leads efforts to create value from the wealth of data that Verisign’s operations generate every day. He brings to Verisign 20 years of experience in conceiving, creating and bringing to fruition innovations; combining thinking big with laboring through the pains of materializing ideas. He has done so in an academic environment, at a startup company, while conducting government and DoD/DARPA sponsored research and for a global Fortune 200 company.
Before joining Verisign, Dr. Labrou was a Senior Researcher at Fujitsu Laboratories of America, Director of Technology and member of the executive staff of PowerMarket, an enterprise application software start-up company and a Research Assistant Professor at UMBC. He received his Ph.D. in Computer Science from UMBC, where his research focused on software agents, and a Diploma in Physics from the University of Athens, Greece. He has authored more than 40 peer-reviewed publications, with almost 4000 citations and he has been awarded 14 patents from the USPTO. His current research focus is data through the entire lifecycle from generation to monetization.
– more information and directions: http://bit.ly/UMBCtalks –
Prajit Kumar Das, 12:33am 27 January 2015
In this post we will talk about certain User Interface (UI) technological advances that we are observing at the moment. One such development was revealed in a recent media event conducted by Microsoft, where they announced the Microsoft HoloLens, a computing platform which achieves seamless connection between the digital and the physical world, quite similar to the experience referred to in certain movies in the past.
It is interesting to note that the design of the HoloLens device looks so similar to something we have seen before.
Even the vision of holographic computing and users interacting with such interfaces isn’t a new one. The 2002 movie “The first $20 million is always the hardest” was possibly the first time we saw how such a futuristic technology might look like.
How did we reach here? A brief discussion on UIs…
User interfaces have always been an important aspect of computers. In its early days computers had a monochromatic screen (or at-most a duo-chromatic screen). A user would type in commands into the screen and computers would execute said commands. Since the commands would be entered in a single or a series of lines, this interface was called the Command-Line Interface (CLI).
Command Line based UI
Such an interface was not particularly intuitive as you had to know the list of commands that would fulfill a certain task. Albeit a certain group of individuals i.e. geeks and some computer programmers, like me, prefer such an interface owing to its clean and distraction free nature. However, owing to the learning curve of CLIs, researchers at Stanford Research Institute and Xerox PARC research center invented a new User interface called the Graphical User Interface (GUI). There were a few variations of the GUIs for example the point and click type also known as WIMP (windows, icons, menus, pointer) UI created at the Xerox PARC research center and made popular by Apple through it’s Macintosh operating systems
Apple’s Macintosh UI
And also adopted by Microsoft in its Windows operating systems
Microsoft’s Windows UI
Some early versions even included a textual user interface with programs which had menus that could be parsed using a keyboard instead of a mouse.
Early textual menu based UI
Eventually new avenues were created for UI research. Continuing onwards from textual interfaces to the WIMP interfaces to the world wide web where objects on the web became entities accessible through a Uniform Resource Identifier (URI). Such an entity could possibly have Semantics associated with them too (as defined by Web 2.0). However, with the advent of mobile smart-phones we saw a completely different class of user interfaces. The touch-based user interfaces and its more evolved cousin the multi-touch systems which allowed gesture based interactions.
Touch and gesture based UI
This was the first time in computing history that humans were able to directly interact with an object on their device with their hands instead of using an input device. The experience was immersive but yet these objects had not entered into the real world. We were on precipice of a revolution in computing.
This revolution was the mainstream launch of Wearable Technology and Virtual/Augmented Reality and Optical Head Mounted Display devices with the creation of devices like the Oculus Rift, Google Glass and EyeTap among others. These devices allowed voice inputs and created a virtual or an augmented reality world for it’s user. Microsoft too was working on gesture based interactions with the Kinect device and research in the Natural User Interface (NUI) field. Couple of interesting works worthy of taking a look from this revolution are listed below.
This talk by John Underkoffler demos a UI that we saw in the movie Minority Report. He talks about the spatial aspect of how humans interact with their world and how computers might be able to help us better if we could do the same with our computers.
Here Pranav Mistry, currently the Head of the Think Tank Team and Director of Research of Samsung Research America, speaks of SixthSense. A new paradigm in computing that allows interaction between the real world and the digital world. All these works were knocking on the doors of a computer as we saw in the 2002 movie mentioned earlier, a real life holographic computer. Enter Microsoft HoloLens!
What is Microsoft HoloLens?
Microsoft HoloLens is an augmented reality computing platform. As per the review from Forbes.com this device has taken a step beyond current work by adding to the world around its user, virtual holograms, rather than putting the user in a completely virtual environment. This device has launched a new platform of software development, i.e. Holographic apps. As well as, the device has created a scope for hardware research and development, as it requires new components like the Holographic Processing Unit or HPU. Visualization and sharing of ideas and interaction with the real world can now be done as envisioned in the TED talk by Pranav Mistry. A more natural way of interacting with digital content as envisioned in the works above are a reality now. The device tracks its user’s movements in an environment. It detects what a person is looking at and transforms the visual field by overlaying 3D objects on top of that.
What kind of applications can we expect to be developed for HoloLens?
When the touch UI became a reality developers had to change the way they worked on software. Direct object interactions as shown above had to be programmed into their applications. Apps for HoloLens would similarly need to handle use-cases of interactions involving voice commands and gesture recognition. The common ideas and their corresponding research implication that come to mind include:
- Looking up a grocery list when you enter the grocery store (context aware)
HoloLens Environment overlaid with lists
- Recording important events automatically (context aware computing)
- Recognizing people in a party (social media and privacy)
- Taking down notes, writing emails using voice commands (natural language understanding)
- Searching for “stuff” around us (nlp, data analytics, semantic web, context aware computing)
- Playing 3D games (animation and graphics)
HoloLens Environment overlaid with 3D Games
- Making sure your battery doesn’t run out (systems, hardware)
- Virtual work environments (systems)
Virtual Work Environments through HoloLens
- Teaching virtual classrooms (systems)
Why or how could it fail?
Are there any obvious pitfalls that we are not thinking about? We can be rest assured that researchers are already looking at ways this venture can fail and for Microsoft’s own good we can be certain they have a list of ways they think this might go and if there are any flaws they are surely working on fixing them. However, as a researcher in the mobile field with a bit of experience with the Google Glass, we can try to list some of the possible pitfalls of a AR/VR device. The HoloLens being a tetherless, Augmented Virtual Reality (AVR) device could possibly suffer from some of these pitfalls too. The reader should understand that we are not claiming any of the following to be scientifically provable because these are merely empirical observations.
- The first thing that worried us while using the Google Glass was that it would sometimes cause us headaches after using it for couple of hours. We have not researched the implications of using the device by any other person so this is and observation from experience. Therefore one concern could be regarding the health impact on a human being with prolonged usage of an AVR device.
- The second thing that was noticed with the Google Glass was how that the device heated up fast. We know from experience that computers do get hot. For example when we play a game they get hot or we do a lot of complex computations they get hot. An AVR device which is being used for playing games will most probably get hot too. At least the Google Glass did after recording a video. Here we are concerned about the heat dissipation and its health impact on the user.
- The third observation that we made was that the Google Glass, showed significant sluggishness when it tried to accomplish computation heavy tasks. Will the HoloLens device be able to keep up with all the computations needed for, say, playing a 3D game?
- The fourth concern is regarding battery capacity. The HoloLens is advertised as a device with no wires, cords or tethers. Anyone who has used a smartphone ever knows the issues of the battery on the devices running out within a day or even half a day. Will the HoloLens be able to carry a charge for long or will it require constant charging?
- The fifth concern that we had was regarding privacy. The Google Glass has faced quite a few privacy concerns because it can readily take pictures using a simple voice command or even a non-verbal command like a ‘wink’. We have worked on this issue as part of our research product FaceBlock. Will the HoloLens create such concerns as this device too has front facing cameras that are capturing a user’s environment and projecting an augmented virtual world to the user.
The above lists of possible issues and probable application areas are not exhaustive in anyway. There will be numerous other scenarios and ways we can work on this new computing platform. There will probably be a multitude of issues with such a new and revolutionary platform. However, the hybrid of augmented and virtual reality has just started taking small steps now. With invention of devices like the Microsoft HoloLens, Google Glass, Oculus Rift, EyeTap etc. we can look forward to an exciting period in the future of Computing for Augmented Virtual Reality.
Tim Finin, 12:32pm 25 January 2015
The fourth Mid-Atlantic Student Colloquium on Speech, Language and Learning (MASC-SLL) will he held at JHU this coming Friday, January 30. It’s a good opportunity to sample current research on language technology and machine learning, including the work of a number of UMBC students. The program for the one-day colloquium includes oral presentations, poster sessions, a panel and three breakout sessions.
The event is free and open to all, but registration is requested by Tuesday, January 27. Note that the location has been moved to the Glass Pavilion on the JHU Homewood Campus
Tim Finin, 11:40pm 20 January 2015
UMBC CSEE alumni Don Miner and Brandon Wilson have started a Meetup group for Hadoop users in and around the Baltimore area to discuss Hadoop technology and use cases.
Apache Hadoop is one of the most popular open-source tools used to harness clusters of computers to process, analyze or learn from massive amounts of data. Whether you are new to Hadoop or an experienced user, this is a great opportunity to improve your knowledge and network with others in the Baltimore computing technology community.
The first meeting will be held from 7:00pm to 9:30pm on Thursday, 19 February 2015 at AOL/Advertising.com at 1020 Hull St #100, Baltimore, MD (map). Join the group here.
Tim Finin, 11:43am 17 January 2015
Facebook’s AI Research (FAIR) group has released open-source, optimized deep-learning modules for their open sourced Torch development environment for numerics, machine learning, and computer vision, with a particular emphasis on deep learning and convolutional nets.
The release includes GPU-optimized modules for large convolutional nets and networks with sparse activations that are commonly used in NLP applications.
See fbcunn for installation instructions, documentation and examples to train classifiers and iTorch for an IPython Kernel for Torch.
Tim Finin, 1:17pm 14 January 2015
The theme of the 2015 Ontology Summit is Internet of Things: Toward Smart Networked Systems and Societies. The Ontology Summit is an annual series of events (first started by Ontolog and NIST in 2006) that involve the ontology community and communities related to each year’s theme.
The 2015 Summit will hold a virtual discourse over the next three months via mailing lists and online panel sessions augmented conference calls. The Summit will culminate in a two-day face-to-face workshop on 13-14 April 2015 in Arlington, VA. The Summit’s goal is to explore how ontologies can play a significant role in the realization of smart networked systems and societies in the Internet of Things.
The Summit’s initial launch session will take place from 12:30pm to 2:00pm EDT on Thursday, January 15th and will include overview presentations from each of the four technical tracks. See the 2015 Ontology Summit for more information, the schedule and details on how to participate in these free an open events.
Tim Finin, 8:40am 6 January 2015
Jeff Shager’s Genealogy of Eliza project has added an 1100-line Perl emulator written by James Markevitch for the 1966 version of BBN-LISP for the PDP-1 computer that can run Bernie Cosell’s original LISP version of doctor.
Markevitch writes in the comments
This is a Perl hack to implement the 1966 version of BBN-LISP for the PDP-1 computer. This was written primarily to run the 1966 LISP version of the “doctor” program (aka Eliza) written by Bernie Cosell. The intent is to be compatible with the version of LISP described in The BBN-LISP System, Daniel G. Bobrow et al, February, 1966, AFCRL-66-180 [BBN66]. However, because many of the quirks of that version of LISP are not documented, The BBN-LISP System Reference Manual April 1969, D. G. Bobrow et al [BBN69] was used as a reference. Finally, LISP 1.5 Programmer’s Manual, John McCarthy et al [LISP1.5] was also used as a reference. N.B. The 1966 version of BBN-LISP has differences from later versions and this interpreter will not properly execute programs written for those later versions.
You can download the Perl Lisp emulator, the doctor lisp code and the script file from the elizagen github repository.
Tim Finin, 3:59pm 3 January 2015
click image for higher-resolution version
Mark Liberman pointed out a nice use of pmi to explore the difference in meaning of geek vs. nerd done last year by Burr Settles using Twitter data.
Settles’s original post, On “Geek” Versus “Nerd”, has a brief, but good, explanation of the method and data.
Tim Finin, 7:07pm 29 December 2014
TABEL — A Domain Independent and Extensible Framework
for Inferring the Semantics of Tables
8:00am Thursday, 8 January 2015, ITE325b
Tables are an integral part of documents, reports and Web pages in many scientific and technical domains, compactly encoding important information that can be difficult to express in text. Table-like structures outside documents, such as spreadsheets, CSV files, log files and databases, are widely used to represent and share information. However, tables remain beyond the scope of regular text processing systems which often treat them like free text.
This dissertation presents TABEL — a domain independent and extensible framework to infer the semantics of tables and represent them as RDF Linked Data. TABEL captures the intended meaning of a table by mapping header cells to classes, data cell values to existing entities and pair of columns to relations from an given ontology and knowledge base. The core of the framework consists of a module that represents a table as a graphical model to jointly infer the semantics of headers, data cells and relation between headers. We also introduce a novel Semantic Message Passing scheme, which incorporates semantics into message passing, to perform joint inference over the probabilistic graphical model. We also develop and explore a “human-in-the-loop” paradigm, presenting plausible models of user interaction with our framework and its impact on the quality of inferred semantics.
We present techniques that are both extensible and domain agnostic. Our framework supports the addition of preprocessing modules without affecting existing ones, making TABEL extensible. It also allows background knowledge bases to be adapted and changed based on the domains of the tables, thus making it domain independent. We demonstrate the extensibility and domain independence of our techniques by developing an application of TABEL in the healthcare domain. We develop a proof of concept for an application to generate meta-analysis reports automatically, which is built on top of the semantics inferred from tables found in medical literature.
A thorough evaluation with experiments over dataset of tables from the Web and medical research reports presents promising results.
Committee: Drs. Tim Finin (chair), Tim Oates, Anupam Joshi, Yun Peng, Indrajit Bhattacharya (IBM Research) and L. V. Subramaniam (IBM Research)
Tim Finin, 10:19pm 21 December 2014
Jeff Shager’s Genealogy of Eliza project has added a BBN LISP version of DOCTOR from 1966 that was recovered from a paper tape. Eliza is the classic conversational program written by Joseph Weizenbaum and and described in a 1966 CACM paper, “ELIZA–a computer program for the study of natural language communication between man and machine“. Weizenbaum wrote Eliza in his Lisp-like SLIP programming language, which ran on an IBM 7094 computer.
BBNer Bernie Cosell wrote this first Lisp version in BBN LISP and based it on the description and examples he read in the CACM paper. The recovered code is in Jeff’s github repository and an emulator that can run it is promised soon.
This is probably pretty close to the MACLISP version of DOCTOR that I played with in the early 1970s. I still have some DECtapes with old files from those days — maybe I’ll find that version of DOCTOR on one of them.