Automated Data Augmentation via Wikidata Relationships
Oyesh Singh, UMBC 10:30-11:30 Monday, 21 October 2019, ITE 346
As machine learning models grow more complex, the need for data is greater than ever. To address the scarcity of annotated data, we look to the ocean of free data in Wikipedia and other Wikimedia resources. Wikipedia has an enormous amount of data in many languages, along with the knowledge graph defined in Wikidata. In this presentation, I will explain how we used Wikipedia and Wikidata data to boost the performance of BERT models for named entity recognition.
With the increasing adoption of cloud computing, cyber attacks have become one of the most effective means for adversaries to inflict damage. To overcome the limitations of existing blacklists and whitelists, our research focuses on developing a dynamic reputation scoring model for sessions, based on a variety of observable and derived attributes of network traffic. Here we propose a technique to greylist sessions using observables such as IP, domain, URL and file hash, scoring them numerically based on the events in the session. This enables automatic labeling of possibly malicious hosts or users, which can help enrich existing whitelists or blacklists.
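To make the greylisting idea concrete, here is a minimal sketch in Python. It is not the model from the talk: the event names, weights and thresholds are all hypothetical, chosen only to show how per-event scores could be summed into a session reputation and mapped to a label.

```python
from dataclasses import dataclass, field

# Hypothetical per-event score adjustments (assumption: the abstract gives no real values).
EVENT_WEIGHTS = {
    "failed_login": -2.0,
    "known_bad_hash": -5.0,
    "blacklisted_domain": -4.0,
    "successful_auth": 1.0,
}


@dataclass
class Session:
    """A network session identified by its source IP, with the events observed in it."""
    ip: str
    events: list = field(default_factory=list)


def reputation_score(session, weights=EVENT_WEIGHTS):
    """Sum the weights of the session's events; unknown events contribute 0."""
    return sum(weights.get(event, 0.0) for event in session.events)


def classify(score, black_threshold=-6.0, white_threshold=2.0):
    """Map a numeric score to a list label using hypothetical thresholds."""
    if score <= black_threshold:
        return "blacklist"
    if score >= white_threshold:
        return "whitelist"
    return "greylist"


# Example sessions using documentation-range IP addresses.
sessions = [
    Session("192.0.2.1", ["successful_auth", "successful_auth"]),
    Session("192.0.2.2", ["failed_login", "failed_login", "known_bad_hash"]),
    Session("192.0.2.3", ["failed_login", "successful_auth"]),
]

labels = {s.ip: classify(reputation_score(s)) for s in sessions}
print(labels)
```

Sessions whose score falls between the two thresholds stay on the greylist, which is where a dynamic model would keep updating the score as new events arrive.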
A Hands-on Introduction to TensorFlow and Machine Learning
Abhay Kashyap, UMBC ebiquity Lab
10:00-11:00am Tuesday, 28 March 2017, ITE346 ITE325b
As many of you know, TensorFlow is an open source machine learning library from Google that simplifies building and training deep neural networks that can take advantage of computers with GPUs. In this meeting, I will introduce some basic concepts of TensorFlow and of machine learning in general. This will be a hands-on tutorial where we will sit and code up some basic examples in TensorFlow. Specifically, we will use TensorFlow to implement linear regression, softmax classifiers and feed-forward neural networks (MLPs). You can find the Python notebooks here. If time permits, we will go over the implementation of the popular word2vec algorithm and introduce LSTMs to build language models.
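For a feel of the first model on that list, here is a minimal sketch of linear regression trained by gradient descent. It is written in plain Python rather than TensorFlow so that it runs without any installation, and it is not taken from the tutorial notebooks; the function name, learning rate and toy data are assumptions made for illustration.

```python
def fit_linear_regression(xs, ys, learning_rate=0.05, epochs=2000):
    """Fit y = w * x + b by batch gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = (w * x + b) - y          # prediction minus target
            grad_w += 2.0 * error * x / n    # d(MSE)/dw
            grad_b += 2.0 * error / n        # d(MSE)/db
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (an assumption made for the example).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_linear_regression(xs, ys)
print(f"w = {w:.4f}, b = {b:.4f}")
```

A library such as TensorFlow computes these gradients automatically, which is what makes the same training loop practical for softmax classifiers and multi-layer networks.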
What you need to know: Python and the basics of linear algebra and matrix operations. While it helps to know the basics of machine learning, no prior knowledge will be assumed, and there will be a gentle, high-level introduction to the algorithms we will implement.
What you need to bring: A laptop that has Python and pip installed. Having virtual environments set up on your computer is also a plus. (Warning: Windows-only users might be publicly shamed)