Archive for May, 2016
May 24th, 2016, by Tim Finin, posted in cloud computing, cybersecurity, Privacy, Security, Semantic Web
Vaishali Narkhede, Karuna Pande Joshi, Tim Finin, Seung Geol Choi, Adam Aviv and Daniel S. Roche, Managing Cloud Storage Obliviously
, International Conference on Cloud Computing, IEEE Computer Society, June 2016.
Consumers want to ensure that their enterprise data is stored securely and obliviously on the cloud, such that the data objects or their access patterns are not revealed to anyone, including the cloud provider, in the public cloud environment. We have created a detailed ontology describing the oblivious cloud storage models and role based access controls that should be in place to manage this risk. We have developed an algorithm to store cloud data using oblivious data structure defined in this paper. We have also implemented the ObliviCloudManager application that allows users to manage their cloud data by validating it before storing it in an oblivious data structure. Our application uses role-based access control model and collection based document management to store and retrieve data efficiently. Cloud consumers can use our system to define policies for storing data obliviously and manage storage on untrusted cloud platforms even if they are unfamiliar with the underlying technology and concepts of oblivious data structures.
May 22nd, 2016, by Tim Finin, posted in cloud computing, KR, Ontologies, Semantic Web
With the increase in the number of cloud services and service providers, manual analysis of Service Level Agreements (SLA), comparison between different service offerings and conformance regulation has become a difficult task for customers. Cloud SLAs are policy documents describing the legal agreement between cloud providers and customers. SLA specifies the commitment of availability, performance of services, penalties associated with violations and procedure for customers to receive compensations in case of service disruptions. The aim of our research is to develop technology solutions for automated cloud service management using Semantic Web and Text Mining techniques. In this paper we discuss in detail the challenges in automating cloud services management and present our preliminary work in extraction of knowledge from SLAs of different cloud services. We extracted two types of information from the SLA documents which can be useful for end users. First, the relationship between the service commitment and financial credit. We represented this information by enhancing the existing Cloud service ontology proposed by us in our previous research. Second, we extracted rules in the form of obligations and permissions from SLAs using modal and deontic logic formalizations. For our analysis, we considered six publicly available SLA documents from different cloud computing service providers.
May 20th, 2016, by Tim Finin, posted in Social media
Profile linking is the ability to connect profiles of a user on different social networks. Linked profiles can help companies like Disney to build psychographics of potential customers and segment them for targeted marketing in a cost-effective way. Existing methods link profiles by observing high similarity between most recent (current) values of the attributes like name and username. However, for a section of users observed to evolve their attributes over time and choose dissimilar values across their profiles, these current values have low similarity. Existing methods then falsely conclude that profiles refer to different users. To reduce such false conclusions, we suggest to gather rich history of values assigned to an attribute over time and compare attribute histories to link user profiles across networks. We believe that attribute history highlights user preferences for creating attribute values on a social network. Co-existence of these preferences across profiles on different social networks result in alike attribute histories that suggests profiles potentially refer to a single user. Through a focused study on username, we quantify the importance of username history for profile linking on a dataset of real-world users with profiles on Twitter, Facebook, Instagram and Tumblr. We show that username history correctly links 44% more profile pairs with non-matching current values that are incorrectly unlinked by existing methods. We further explore if factors such as longevity and availability of username history on either profiles affect linking performance. To the best of our knowledge, this is the first study that explores viability of using an attribute history to link profiles on social networks.
May 12th, 2016, by Tim Finin, posted in Datamining, High performance computing, Machine Learning, NLP
Topic Modeling for Analyzing Document Collection
Computer Science, University of Miami
11:00am Monday, 16 May 2016, ITE 325b, UMBC
Topic modeling (in particular, Latent Dirichlet Analysis) is a technique for analyzing a large collection of documents. In topic modeling we view each document as a frequency vector over a vocabulary and each topic as a static distribution over the vocabulary. Given a desired number, K, of document classes, a topic modeling algorithm attempts to estimate concurrently K static distributions and for each document how much each K class contributes. Mathematically, this is the problem of approximating the matrix generated by stacking the frequency vectors into the product of two non-negative matrices, where both the column dimension of the first matrix and the row dimension of the second matrix are equal to K. Topic modeling is gaining popularity recently, for analyzing large collections of documents.
In this talk I will present some examples of applying topic modeling: (1) a small sentiment analysis of a small collection of short patient surveys, (2) exploratory content analysis of a large collection of letters, (3) document classification based upon topics and other linguistic features, and (4) exploratory analysis of a large collection of literally works. I will speak not only the exact topic modeling steps but also all the preprocessing steps for preparing the documents for topic modeling.
Mitsunori Ogihara is a Professor of Computer Science at the University of Miami, Coral Gables, Florida. There he directs the Data Mining Group in the Center for Computational Science, a university-wide organization for providing resources and consultation for large-scale computation. He has published three books and approximately 190 papers in conferences and journals. He is on the editorial board for Theory of Computing Systems and International Journal of Foundations of Computer Science. Ogihara received a Ph.D. in Information Sciences from Tokyo Institute of Technology in 1993 and was a tenure-track/tenured faculty member in the Department of Computer Science at the University of Rochester from 1994 to 2007.
May 8th, 2016, by Tim Finin, posted in cybersecurity, Machine Learning, Security
Vehicles can be considered as a specialized form of Cyber Physical Systems with sensors, ECU’s and actuators working together to produce a coherent behavior. With the advent of external connectivity, a larger attack surface has opened up which not only affects the passengers inside vehicles, but also people around them. One of the main causes of this increased attack surface is because of the advanced systems built on top of old and less secure common bus frameworks which lacks basic authentication mechanisms. To make such systems more secure, we approach this issue as a data analytic problem that can detect anomalous states. To accomplish that we collected data flowing between different components from real vehicles and using a Hidden Markov Model, we detect malicious behaviors and issue alerts, while a vehicle is in operation. Our evaluations using single parameter and two parameters together provide enough evidence that such techniques could be successfully used to detect anomalies in vehicles. Moreover our method could be used in new vehicles as well as older ones.
May 7th, 2016, by Tim Finin, posted in cloud computing, NLP
May 2nd, 2016, by Tim Finin, posted in KR, Ontologies, Semantic Web
He’s dead, Jim.
Google recently shut down the query interface to Freebase. All that is left of this innovative service is the ability to download a few final data dumps.
Freebase was launched nine years ago by Metaweb as an online source of structured data collected from Wikipedia and many other sources, including individual, user-submitted uploads and edits. Metaweb was acquired by Google in July 2010 and Freebase subsequently grew to have more than 2.4 billion facts about 44 million subjects. In December 2014, Google announced that it was closing Freebase and four months later it became read-only. Sometime this week the query interface was shut down.
I’ve enjoyed using Freebase in various projects in the past two years and found that it complemented DBpedia in many ways. Although its native semantics differed from that of RDF and OWL, it was close enough to allow all of Freebase to be exported as RDF. Its schema was larger than DBpedia’s and the data tended to be a bit cleaner.
Google generously decided to donate the data to the Wikidata project, which began migrating Freebase’s data to Wikidata in 2015. The Freebase data also lives on as part of Google’s Knowledge Graph. Google recently allowed very limited querying of its knowledge graph and my limited experimenting with it suggests that has Freebase data at its core.
May 1st, 2016, by Tim Finin, posted in KR, NLP, Ontologies, Semantic Web
Representing and Reasoning with Temporal
Properties/Relations in OWL/RDF
10:30-11:30 Monday, 2 May 2016, ITE346
OWL ontologies offer the means for modeling real-world domains by representing their high-level concepts, properties and interrelationships. These concepts and their properties are connected by means of binary relations. However, this assumes that the model of the domain is either a set of static objects and relationships that do not change over time, or a snapshot of these objects at a particular point in time. In general, relationships between objects that change over time (dynamic properties) are not binary relations, since they involve a temporal interval in addition to the object and the subject. Representing and querying information evolving in time requires careful consideration of how to use OWL constructs to model dynamic relationships and how the semantics and reasoning capabilities within that architecture are affected.