Official website of the Department of Homeland Security

Research Data Repository

The DHS Science and Technology Directorate (S&T) is sponsoring an initiative to make computer and network operational data more accessible for use in defensive cybersecurity R&D. The PREDICT (Protected Repository for the Defense of Infrastructure Against Cyber Threats) initiative represents an important three-way partnership among government, critical information infrastructure providers, and security development communities (both academic and commercial), all of whom seek technical solutions to protect the public and private information infrastructure. The primary goal of PREDICT is to bridge the gap between producers of security-relevant network operations data and the technology developers and evaluators who can use this data to accelerate the design, production, and evaluation of next-generation cybersecurity solutions.

Technology developers and evaluators often judge the efficacy of their technical solutions using anecdotal evidence or small-scale test experiments rather than more comprehensive real-world data. PREDICT provides them with regularly updated network operations data sources relevant to cybersecurity defense technology development. The data include sources that are minimally anonymized, if not entirely uncensored. PREDICT is intended to provide timely and detailed insight into cyberattack phenomena occurring across the Internet, and in some cases will reveal the effects of these attacks on networks that are owned or managed by the data producers.

The PREDICT website contains an overview, general background information, and the data repository. Basic categories of datasets include those relating to IP packet headers and Internet topology data. Descriptions of the specific categories are provided along with descriptors relating to the fields of the individual datasets. As specified on the website, access to the PREDICT data repository is available to eligible research groups upon approval of their applications. In addition, new sources of data are continually being sought.

Considerable effort has been devoted within the PREDICT community to ensuring the privacy of individuals and organizations with respect to the contents of the data repository. The DHS PREDICT Privacy Impact Assessment document is available and represents a significant proactive analysis of the privacy concerns and the measures needed to address them.
