Cognitively Rich Framework to Automate Extraction and Representation of Legal Knowledge


With the explosive growth in cloud-based services, businesses are increasingly maintaining large datasets containing information about their consumers in order to provide a seamless user experience. To ensure the privacy and security of these datasets, regulatory bodies have specified rules and compliance policies that organizations must adhere to. These regulatory policies are currently available as text documents that are not machine-processable, so continuously monitoring them to ensure data compliance requires extensive manual effort. We have developed a cognitive framework that automatically parses legal documents, extracts knowledge from them, and represents it using an ontology. The framework captures knowledge in the form of key terms, rules, topic summaries, relationships between legal terms, semantically similar terminologies, deontic expressions, and cross-referenced legal facts and rules. We built the framework using deep learning technologies such as TensorFlow for word embeddings and text summarization, Gensim for topic modeling, and Semantic Web technologies for building the knowledge graph. We have applied this framework to the United States government's Code of Federal Regulations (CFR), which includes facts and rules for individuals and organizations seeking to do business with the US federal government. In this paper, we describe our framework in detail and present results from the CFR legal knowledge base that we built with it. Businesses can adopt our framework to build their own automated compliance monitoring systems.
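
To make the notion of a deontic expression concrete, the following is a minimal sketch of how a sentence from a regulation might be tagged as an obligation, prohibition, or permission. The abstract does not say how the framework detects these expressions, so the cue lists, the `DEONTIC_CUES` table, and the `classify_deontic` function are all hypothetical illustrations based on simple keyword matching, not the authors' method.

```python
import re

# Hypothetical cue phrases for each deontic category (an assumption made
# for illustration; the paper's actual extraction rules are not given here).
DEONTIC_CUES = {
    "prohibition": ["shall not", "must not", "may not"],
    "obligation": ["shall", "must", "is required to", "are required to"],
    "permission": ["may", "is permitted to", "are permitted to"],
}

# Prohibitions are checked first because "shall not" also contains "shall".
CHECK_ORDER = ("prohibition", "obligation", "permission")


def classify_deontic(sentence):
    """Return the deontic category of a sentence, or None if no cue is found."""
    text = sentence.lower()
    for label in CHECK_ORDER:
        for cue in DEONTIC_CUES[label]:
            # Word boundaries keep "may" from matching inside "mayor".
            if re.search(r"\b" + re.escape(cue) + r"\b", text):
                return label
    return None


if __name__ == "__main__":
    examples = [
        "The contractor shall submit the report within 30 days.",
        "The contractor shall not disclose the data.",
        "The agency may waive this requirement.",
    ]
    for s in examples:
        print(classify_deontic(s), "|", s)
```

A keyword approach like this misses context (for example, "may" used to express possibility rather than permission), which is one reason a fuller pipeline would combine such cues with parsing or learned representations.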


compliance, deep learning, legal text analytics, semantic web


University of Maryland Baltimore County

1000 Hilltop Circle


UMBC ebiquity