The Information Marketplace for Policy and Analysis of Cyber-risk & Trust (IMPACT) project supports the global cyber-risk research community by coordinating and developing real-world data and information-sharing capabilities, including tools, models and methodologies. To accelerate solutions around cyber-risk issues and infrastructure security, the IMPACT project enables empirical data and information-sharing across the global cybersecurity research and development (R&D) community in academia, industry and government. Importantly, IMPACT also addresses the cybersecurity decision-analytic needs of Homeland Security Enterprise (HSE) customers in the face of high-volume, high-velocity, high-variety and/or high-value data through its network of Decision Analytics-as-a-Service Providers (DASP). Each of these resources is a service technology or tool capable of supporting the following types of analytics: descriptive (what happened), diagnostic (why it happened), predictive (what will happen) and prescriptive (what should happen).
The Department of Homeland Security (DHS) Science and Technology Directorate (S&T) seeks to coordinate, enhance and develop advanced data and information-sharing tools, datasets, technologies, models, methodologies and infrastructure to strengthen the capabilities of national and international cyber-risk R&D. These data-sharing components are intended to be broadly available as national and international resources to bridge the gap between producers of cyber-risk-relevant ground truth data, academic and industrial researchers, cybersecurity technology developers, and decision-makers to inform their analysis of and policymaking on cyber-risk and trust issues.
Cybersecurity R&D requires real-world data to develop advanced knowledge, test products and technologies and prove the utility of research in large-scale network environments. Established and funded by S&T, the predecessor project, Protected Repository for the Defense of Infrastructure Against Cyber Threats (PREDICT), was the only publicly available, legally collected, distributed repository of large-scale datasets containing real network and system traffic that could be used to advance state-of-the-art cybersecurity R&D. The centralized brokering and distributed provisioning between the data providers, data hosts and researchers address the operational, trust and administrative costs and challenges that impede sustainable and scalable data-sharing. IMPACT continually adds new data that is responsive to cyber-risk management (e.g., attacks and measurements) to provide the R&D community with timely, high-value information to enhance research innovation and quality. The IMPACT model also serves as a laboratory for testing various data-sharing models, including batch transfers, newer Data Analytics as a Service (DaaS) and visualization techniques.
IMPACT consists of four components supporting core functional requirements for data-sharing: metadata discovery, data and tool matchmaking, trusted brokering and a social feedback loop.
- Metadata Indexing (Find)—An open, comprehensive, centralized and standardized interface and engine to access metadata from a federation of providers and hosts.
- Data and Tools Matchmaking (Request and Use)—Standardized policies and procedures that connect researchers with a federation of providers and hosts, and a central interface and process to discover and access data and the tools that can analyze and/or use the data from within and outside of IMPACT.
- Administrative, Legal, Ethical Brokering (Coordinate)—A centralized interface, policies and procedures to request datasets from a federation of providers and hosts; vetted data source provenance; and mediated access entitlement so sensitive data is shared with legitimate researchers.
- Social Networking (Feedback Loop)—A central platform for exchanging feedback between providers, hosts, researchers and domain experts that helps improve and optimize data, tools, analytics and collective knowledge.
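The four components above can be pictured with a small sketch. This is only an illustration under stated assumptions, not IMPACT's actual implementation or API: the class names, fields, dataset identifiers and vetting rule below are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DatasetRecord:
    """Hypothetical metadata entry published by a provider."""
    dataset_id: str
    provider: str
    host: str
    keywords: set          # lowercase search terms
    restricted: bool       # True if access must be mediated
    feedback: list = field(default_factory=list)


@dataclass
class MetadataIndex:
    """Find: one central index over metadata from many providers and hosts."""
    records: list = field(default_factory=list)

    def register(self, record):
        self.records.append(record)

    def find(self, keyword):
        kw = keyword.lower()
        return [r for r in self.records if kw in r.keywords]


def broker_request(record, researcher_vetted):
    """Coordinate: release restricted data only to vetted researchers."""
    if record.restricted and not researcher_vetted:
        return "denied"
    # Request and Use: the data itself stays with the host (distributed provisioning).
    return f"approved: fetch from {record.host}"


def add_feedback(record, comment):
    """Feedback Loop: researchers attach comments to a dataset."""
    record.feedback.append(comment)


# Example with made-up records
index = MetadataIndex()
index.register(DatasetRecord("ds-ddos-01", "provider-a", "host-1", {"ddos", "attack"}, True))
index.register(DatasetRecord("ds-topo-02", "provider-b", "host-2", {"topology"}, False))

match = index.find("DDoS")[0]
print(broker_request(match, researcher_vetted=False))
print(broker_request(match, researcher_vetted=True))
add_feedback(match, "Timestamps are in UTC.")
```

In this sketch, only the metadata is centralized; the data remains with each host, which mirrors the centralized brokering and distributed provisioning described earlier.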
Carnegie Mellon University: Measuring and analyzing online anonymous ('darknet') marketplaces
This project will build and deploy queryable online platforms for online crime repositories. It will primarily focus on two types of data: anonymous online marketplace data and corpora of search-redirection attacks, which are primarily used to attract customers to illicit or fraudulent websites. This effort will build and deploy simple web-based graphical interfaces, accessible to partner institutions and researchers at no cost, integrating both types of data into IMPACT.
Galois: The Framework for Information Disclosure with Ethical Security (FIDES)
This effort is a scalable, fine-grained, technical disclosure control system for IMPACT datasets. FIDES reduces risk for data providers by keeping non-anonymized data cryptographically secure for its entire lifetime. Neither end-users nor malicious adversaries can access such data “in the clear” at any time. Also, FIDES provides high utility for analyses that require direct access to sensitive details in the data, a capability not achievable with existing pre-anonymized approaches.
Georgia Tech Research Institute: Real-world, Large Scale Network- and Host-level Threat Intelligence
This effort leverages the institute's extensive malware analysis experience to make real-world, large-scale malware analysis datasets available. It enables the design, production, and evaluation of next-generation cybersecurity solutions by offering hard-to-find, high-value data curated by experienced personnel in a manner that is mindful of the associated legal and ethical risks, which lowers the barrier to entry to cybersecurity R&D.
Massachusetts General Hospital: Healthcare Data Generation and Curation for Cybersecurity Analysis
The lack of medical device data cyber-curation is impeding the development of critical capabilities needed for the cyber protection of hospital clinical environments. This effort is addressing these needs by generating diverse medical device datasets in a simulated hospital environment and establishing a medical device data repository for use by IMPACT researchers to develop monitoring rulesets and tools based on changes in network behavior under normal clinical operations and abnormal circumstances.
Parsons: Internet Risk Assessment and Mitigation (I-RAM)
It is important that organizations recognize their level of exposure to threats from attacks on the internet infrastructure even when they do not control all the components that make the interconnection of their systems possible. This effort enables an organization to examine its exposure to internet infrastructure risks in a systematic manner and take actions to mitigate such risks.
University of California San Diego, Center for Applied Internet Data Analysis (CAIDA): Advancing Scientific Study of Internet Security and Topological Stability (ASSISTS)
Large-scale internet cyberattacks and incidents represent a major threat to public safety and to public and private strategic and financial assets. This effort helps counter these threats by developing datasets that have already proven relevant to target cybersecurity challenge problems. It will also provide new DaaS capabilities through an interactive interface that lets users fuse datasets reflecting immediate threats, vulnerabilities and hazards to communications infrastructures.
University of Southern California Information Sciences Institute (USC-ISI): Los Angeles/Colorado Application and Network Information Community
Cyber datasets are essential to develop new research and products that improve cybersecurity defenses. Datasets help evaluate defenses against Distributed Denial-of-Service (DDoS) attacks, detect route hijacking, understand the internet topology to improve its resiliency and plan deployments of services with anycast (i.e., data from a single sender routed to any one of several destination nodes) and content delivery networks. This effort will support the IMPACT program through the creation of foundational and derived datasets. It is also responsible for creating web-based services and installable tools that support the program.
University of Wisconsin: Datasets, Methods and Tools for Internet Security Decision Analytics
This effort addresses the challenges of collecting, organizing and hosting unique and diverse datasets, and creating techniques and tools to support and enable decision analytics for HSE. It will create new capabilities to collect and fuse data across different layers of the network protocol stack, and develop new techniques and tools that provide unique, spatio-temporal perspectives and insights on risks and behaviors across the operational landscape of the internet.
InferLink: Advanced Indexing and Search for Efficient Information Discovery
This effort will support the program by extending InferLink's ActiveSearch system to the IMPACT portal. The work will focus on three broad goals. First, it will extend ActiveSearch to support searches through collections of resources. Second, it will investigate approaches to broaden the ways users can find relevant data, beyond a traditional search. Third, the project will evaluate the performance of the technology, including precision, recall, and response time.
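As a rough illustration of the evaluation metrics named above, the following sketch computes precision and recall for a single search query. It is not part of ActiveSearch or the IMPACT portal; the function name and the dataset identifiers are hypothetical, and relevance judgments are assumed to be available as a plain list.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant results returned / all results returned
    recall    = relevant results returned / all relevant items
    """
    retrieved_set = set(retrieved)
    relevant_set = set(relevant)
    hits = retrieved_set & relevant_set
    precision = len(hits) / len(retrieved_set) if retrieved_set else 0.0
    recall = len(hits) / len(relevant_set) if relevant_set else 0.0
    return precision, recall


# Made-up identifiers: the search returned four datasets, three are truly relevant.
retrieved = ["ds-ddos-01", "ds-bgp-07", "ds-dns-03", "ds-topo-02"]
relevant = ["ds-ddos-01", "ds-dns-03", "ds-darknet-05"]

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.67
```

Response time, the third metric, would be measured separately, for example by timing each query with `time.perf_counter()`.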