# Data Model Introduction
All HTAN Centers are required to encode their data and metadata in a common HTAN Data Model. The HTAN Data Model is created via a community Request for Comment (RFC) process, with participation from all HTAN Centers, and covers clinical, biospecimen, genomic, transcriptomic, proteomic, imaging and spatial profiling data.
HTAN has had two phases:
- HTAN Phase 1 (2018-2025)
- HTAN Phase 2 (2025-present)
# HTAN Phase 1 Data Model
HTAN Phase 1 Data Model GitHub Repository
Where possible, the HTAN Phase 1 Data Model leveraged previously defined data standards across the scientific research community, including the NCI Genomic Data Commons, the Human Cell Atlas, the Human Biomolecular Atlas Program (HuBMAP) and the Minimum Information about Tissue Imaging (MITI) reporting guidelines.
# HTAN Phase 2 Data Model
HTAN Phase 2 Data Model GitHub Repository
In HTAN Phase 2, the Data Model is being updated with three main aims:
- Align the HTAN Data Model with the NCI's Cancer Research Data Commons (CRDC) standards.
- Add or enhance existing standards to accommodate new assays.
- Refine and strengthen the model to support FAIR data sharing principles by:
  - eliminating unused attributes;
  - changing requirements and valid values;
  - clarifying attribute definitions; and
  - standardizing data file formats where possible.
# Data Standards
Complete information regarding the HTAN Phase 1 Data Model and specific data elements is available at: https://data.humantumoratlas.org/standards.
This manual and the HTAN Data Portal standards pages will be updated as the Phase 2 Data Model is implemented.