ICPSR Collection Development Policy

Executive Summary

ICPSR maintains an extensive archive of data to support research and knowledge building in the social and behavioral sciences. This policy sets forth a description of the characteristics of data that ICPSR has an interest in adding to the collection. ICPSR intentionally casts a broad net in order to add a wide range of data that would be of interest to the diverse fields representing the social and behavioral sciences. However, the organization also applies additional appraisal criteria to determine the appropriate level of curatorial investment that ICPSR will make to ensure long-term and effective use of the data.

By balancing broad interests of the social and behavioral sciences with more focused investment in curation, ICPSR will add a wide array of social and behavioral science data to the archive while using member resources strategically to meet the needs of the research community and the broad public.

Commitment to Diversity, Equity, and Inclusion

In the changing social and behavioral science landscape, ICPSR recognizes the importance of remaining socially responsible and providing equitable access to a diverse range of data; this means maintaining an unwavering commitment to diversity, equity, accessibility, and inclusion in our collection development activities.

As a social institution, ICPSR has a "responsibility to advance learning and research that diminishes access barriers and brings the narratives and research of often-marginalized populations to the forefront...Our goal is to encourage an equitable perspective in the research community through ICPSR's practice of acquiring and disseminating data" (ICPSR Diversity Statement). Accordingly, the practices outlined in this policy were written with the core goal of collecting data that embody DEI.

ICPSR is a domain repository that seeks to acquire, archive, and disseminate data of interest to researchers, broadly defined, in the social and behavioral sciences. Our definition of the domain is necessarily broad as we recognize the value of many disciplines to address key questions about the human experience in all its diversity and richness.

To guide the growth and management of ICPSR's collection, we have created a policy that offers clear goals but also a degree of flexibility to enable ICPSR to respond to changes in the research environment. We present the policy here to inform ICPSR membership, users, core partners, prospective depositors, and potential funders of the principles governing our collection development activities.

The Inter-university Consortium for Political and Social Research began to build a collection of data to be shared across its member institutions in 1962. The early archive included the American National Election Study and other sample survey data. By the 1970s, several large-scale social science surveys, including some that were conducted by the various centers of the Institute for Social Research at the University of Michigan, were added. ICPSR's collection of survey data strengthened in the decades following. The archive also expanded in new ways, partly due to federal grants and contracts awarded to ICPSR to archive special collections, such as data on criminal justice and aging. In later years, additional topical archives were added: substance/drug use, HIV, education (including early education and childcare), health and medical care, disability and rehabilitation studies, demography, racial and ethnic minorities, and several others. (See the thematic collections page for a full listing.)

Over time, ICPSR began to add new data types from a wide range of quantitative and qualitative methods used across the social and behavioral sciences. As ICPSR's capacity for curating new kinds of data about the human experience has grown, there has been simultaneous, albeit sometimes slow, growth in the culture of data sharing throughout the scientific research community. ICPSR has positioned itself to invite data from a full range of social, behavioral, and increasingly health science disciplines in recognition that social relationships and status are closely intertwined with human behavior, biological processes, cognition, and clinical experiences, to name but a few.

Data from ICPSR are used primarily by the academic research community, which includes researchers and students around the world. ICPSR data and data-related products and services are also used by policymakers, consultants, service providers, journalists, and other professionals who are interested in building and analyzing evidence to help shape, inform, and evaluate policy. As ICPSR continues to increase its supply of open-access data with support from third-party funders, the broader public, beyond ICPSR's membership, will increasingly be able to take advantage of ICPSR data.

  • Anthropology & Archeology
  • Area Studies
  • Arts & Humanities
  • Behavioral Health
  • Communication
  • Criminology
  • Data Science
  • Demography
  • Disability Studies
  • Economics
  • Education
  • Environmental Science
  • Epidemiology
  • Ethnic Studies
  • Gender Studies
  • Geography
  • Gerontology
  • Government
  • Health Care
  • History
  • Human Development
  • Information Science
  • Journalism
  • Law
  • Medicine
  • Nursing
  • Political Science
  • Psychology
  • Public Health
  • Public Policy
  • Social Work
  • Sociology
  • Statistics

  • Administrative Data
  • Audio
  • Biomarker
  • Clinical Trials
  • Code and Syntax to apply to data
  • Content Analysis
  • Data Mining
  • Data visualization
  • EEG, Brain Imaging
  • Experiments
  • Field Research
  • Geospatial
  • Historical Methods
  • Interventions
  • Interviewing Techniques
  • Observational Techniques
  • Policy Analysis
  • Qualitative
  • Replication Data
  • Sensor Data
  • Survey Techniques (including online)
  • Teaching Packages with Data
  • Textual Analysis
  • Transactions Data
  • Video
  • Web Scraping

  • Adults and the Elderly
  • Children & Adolescents
  • Criminal Justice Populations
  • Families, Couples, & Households
  • Institutions
  • International/Cross-National/Comparative
  • State/Regional/Local
  • United States
  • Various Gender, Race & Ethnic Groups

ICPSR is frequently asked to define what types of data it will not accept. The following list outlines some of the criteria used to define data that are not in scope for ICPSR.

  • Non-Social and Behavioral Research Data: Data that cannot be connected with or used to expand upon the scientific investigation of the social and behavioral dimensions of human lives (both antecedents and consequences) will not be acquired. For example, much data in the physical sciences are out of scope for ICPSR.
  • Cost of Data: ICPSR generally does not purchase data or pass along the costs of access to proprietary data to the user community. Therefore, data with associated fees may be considered out of scope for ICPSR.
  • Limited Access Data: ICPSR generally does not accept data requiring limitations on use, with the exception of data with access conditions intended to protect the privacy and identity of study subjects. For example, ICPSR generally does not accept data where access would be conditional on publication review or authorship requirements.
  • Availability Elsewhere: ICPSR prefers to be the archive of record for a data collection. Data that are permanently available from another trusted repository may instead be linked to from the ICPSR catalog. ICPSR may accept data that are available elsewhere in cases where ICPSR can add value to the data through curation or preservation.
  • Copyright: ICPSR only accepts data when the data contributor grants ICPSR rights to curate, disseminate, and preserve a copy of the data.
  • Capacity: There may be instances where ICPSR rejects data due to technological limitations. For example, some data can be reviewed only with specific proprietary software that is not currently available at ICPSR.

Historically, ICPSR has acquired and processed government data collections either with support of the ICPSR membership or through topical archives at ICPSR that make data freely available to the public. ICPSR will continue to update its government data series so that users do not encounter gaps in these important data collections. ICPSR will prioritize acquiring government data collections when it believes it can: (1) add significant value for its users, (2) ensure the long-term preservation of the data, and/or (3) add value through data curation (especially DDI) to increase access to, discoverability of, and correct use of the data.

For other important government collections, ICPSR may provide links to the original data sources in its catalog in order for users to access the most up-to-date government data series. ICPSR will continue to accept requests from users about government data that are difficult to locate or use, or of such high interest that their acquisition by ICPSR is justified.

ICPSR identifies high-priority data through review and analysis of user demand (user behavior, recommendations of ICPSR Council, Official Representatives, and the membership) and scanning the research landscape (review of scholarly publications, grant award databases, and trending research topics in the news). ICPSR remains flexible in identifying high-priority data as new topics emerge and become relevant and important among the ICPSR data user community. ICPSR's goal is to be responsive to user demand for data, particularly when the topics are found to be limited in ICPSR's current holdings. Given our commitment to DEI, ICPSR will continue to prioritize studies in which researchers have shared their results with communities from which their data were taken (i.e., shared analysis and interpretation of the data).

  1. ICPSR prefers data in a readily usable format (see the Library of Congress' Recommended Format Specifications), accessible in a variety of computing and technological settings.
  2. ICPSR prefers data formats that promote easy access and use without compromising research value.
  3. ICPSR prefers that data files deposited in a raw format be transformable or convertible into formats usable by a variety of statistical or analytical software.
  4. ICPSR prefers data files unaccompanied by value-added software.
  5. Data in obsolete, proprietary, or hard-to-use formats may still be accepted by ICPSR, although these characteristics may compromise any future use of the data other than as-is, bit-level access.

  1. ICPSR requires that data deposited in the archive meet recognized standards for privacy and confidentiality of subjects studied. (For information on these standards, see the University of Michigan’s Human Research Protection Program information.)
  2. ICPSR prefers to acquire data that can reside in the public domain.
  3. ICPSR requires that data intended for public use be formatted so that identifiers inadvertently included in the data can be removed using standard practices without reducing the research value of the original data.
  4. Any access limitations that ICPSR might apply to specific data collections (e.g., a requirement that restricted-use agreements must be signed) should be legally justified and manageable given ICPSR's resources, goals, and mission.

ICPSR offers two options: making data available to the user community in the condition deposited (no curation), or member-funded curation, which involves review, enhancement, and quality checking of the data to ensure usability and findability.

  • No Curation: The least restrictive stream of data entering ICPSR is data that receive no curation. Any depositor with data meeting the terms of ICPSR's broad Collection Development Policy may deposit and publish data in openICPSR, an open-access repository. The depositor grants ICPSR the right to decide whether to curate these data for the ICPSR membership.
  • Member-Funded Curation: ICPSR also accepts and curates data that are considered to be valuable (either in the present or future) to the membership of ICPSR. There are additional selection criteria placed on data that are curated for the ICPSR membership. ICPSR can apply one of three possible curation levels to data, depending on the size, complexity, condition, and potential value of the data deposited. Potential value is determined by the following criteria:
    • Popularity: Data with evidence of significant usage will undergo review for possible member-funded curation. Data that may appeal to researchers across multiple disciplines (i.e., interdisciplinary data) will be strongly considered for curation.
    • Series: ICPSR maintains longstanding series and will continue to curate new data that are part of these series to maximize the historic investment in the data by funding agencies, data producers, researchers, and ICPSR itself.
    • Methodological Rigor: Data that are methodologically sound, including but not limited to nationally representative sampling designs, will be identified and acquired for member-funded curation. Data stemming from an ineffective or flawed research design will not be curated for the membership and will instead be deferred to openICPSR.
    • Scientific Reputation: Highly cited data collections, data collected by frequently cited scientists, and data resulting in high-quality citations (impact) will be identified and acquired for member-funded curation.
    • Data and Documentation Quality: Quality of the data and documentation are considered when reviewing incoming data. If there is inadequate documentation and/or data are of poor quality, data will not be curated for the membership due to the higher cost of curation and prospects for more limited use. Data not meeting these criteria will be deferred to openICPSR.
    • High-Priority: Data that are in demand and/or that represent known gaps in ICPSR holdings will be acquired for curation for the ICPSR membership. ICPSR understands that new areas of research may be more experimental and as such the data might not otherwise meet the criteria for curation. ICPSR considers these high-risk, high-reward data as being worthy of curation for the membership at lower levels of quality, methodological rigor, and reputation.

ICPSR also has several grants and contracts to provide data archiving services to various research communities by creating topical archives, but each of these projects has developed its own set of selection criteria that fit within the broad Collection Development Policy of ICPSR.

ICPSR's investment in curation ensures usability, findability, and long-term access to data it anticipates being of greatest value today and into the future. Curation services include reviewing incoming data and documentation for accuracy, consistency, and meaning, and ensuring that the data can be understood by users who did not collect them. The rich metadata developed by ICPSR during curation also helps in the delivery of data to the user so that data can be understood and found by the widest audience. Techniques for minimizing disclosure risk are applied to the data during curation. Curated data are preserved in an archival format to ensure long-term access and are also presented in multiple formats currently in use for easier, immediate access by users. In addition to the curation services funded by the ICPSR membership or government grants and contracts, data contributors may purchase curation services.

This policy is subject to review and re-issuance by the ICPSR Council every five years. It is open for review and comment by the membership at any time.

Approved date: 09/13/2021