Semantic Interpretation of Structured Log Files
Authors: Piyush Nimbalkar
Date: August 01, 2015
Abstract: Log files record the events that occur in applications, operating systems, and even network devices. Originally they were used to record information for diagnostic and debugging purposes. Nowadays, logs are also used to track events for auditing and forensics in case of malicious activity or system attacks. Software such as intrusion detection systems, web servers, anti-virus and anti-malware systems, firewalls, and network devices generates logs with useful information that can be used to protect against such attacks. Analyzing log files can help in proactively avoiding attacks against the systems. While existing tools do a good job when the format of log files is known, the challenge lies in cases where log files are from unknown devices and of unknown formats. We propose a framework that takes any log file and automatically produces a semantic interpretation as a set of RDF Linked Data triples. The framework splits a log file into columns using regular expression-based or dictionary-based classifiers. Leveraging and modifying our existing work on inferring the semantics of tables, we identify every column from a log file and map it to concepts either from a general-purpose knowledge base (KB) like DBpedia or from domain-specific ontologies such as IDS. We also identify relationships between the columns in such log files. Converting large and verbose log files into such semantic representations will help in better search, integration, and rich reasoning over the data.
Publisher: University of Maryland, Baltimore County
Number of downloads: 416