# Using the HTAN Data Portal

The HTAN Data Portal provides access to all open access HTAN data. To get started, go to: https://data.humantumoratlas.org/explore.

To orient you to the HTAN Data Portal, consider the example of accessing precancerous polyp data from Vanderbilt University, as described in their recent Cell publication.

By default, HTAN data is organized by research center:

HTAN Portal: Home Page

If you scroll down on the page, you will see Vanderbilt University:

HTAN Portal: Vanderbilt Atlas

As of this writing, you can see that the Vanderbilt Colon Atlas project has 90 cases and 193 biospecimens.

# Downloading Metadata

Once you have identified the project of interest, you can click the download metadata button:

HTAN Portal: Download Metadata

You will then see a dialog box listing all metadata files associated with the selected project. For example:

HTAN Portal: Metadata Table

Behind the scenes, HTAN leverages the Synapse Platform created and maintained by Sage Bionetworks. Each piece of HTAN data is automatically assigned a unique Synapse identifier, such as syn25010909. In the screenshot above, you can see that the Vanderbilt project has multiple metadata files, each associated with a unique Synapse identifier.

If you click on any of the Synapse links above, you can immediately download a comma-separated values (CSV) file for that metadata category. There is no need to create a Synapse account or log into Synapse. For example, here we have downloaded the Vanderbilt biospecimen file and loaded it into Excel:

HTAN Tabular Data within Excel

Once you have downloaded metadata files, you can parse them in your favorite programming language, such as R or Python. To understand the individual columns within each metadata file, please refer to the HTAN Data Model.
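As a minimal sketch of parsing a downloaded metadata file in Python, the helpers below read a CSV into a list of rows and tally the values in one column. They use only the standard library. The function names are illustrative, and any file or column name you pass in is an assumption until you have checked it against your own download and the HTAN Data Model.

```python
import csv
from collections import Counter


def load_metadata(path):
    """Read a downloaded metadata CSV into a list of dicts, one per row."""
    # utf-8-sig drops a byte-order mark if the file was re-saved from Excel.
    with open(path, newline="", encoding="utf-8-sig") as handle:
        return list(csv.DictReader(handle))


def count_by_column(rows, column):
    """Count rows per value of `column`; blank or missing cells are skipped."""
    return Counter(row[column] for row in rows if row.get(column))
```

For example, `rows = load_metadata("biospecimen.csv")` followed by `count_by_column(rows, "Biospecimen Type")` would summarize one column, where both the file name and the column name are hypothetical placeholders.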

# Exploring Available Data

The HTAN Data Portal provides a unified interface for filtering and exploring HTAN data sets. Each filter is available at the top of the page:

HTAN Portal: Filters

To get started, you can click the Atlas pull-down menu, and select the Vanderbilt HTAN center:

HTAN Portal: Filter by HTAN Center

Your selection will now be reflected in the user interface:

HTAN Portal: Filter by Vanderbilt University

If you click the Cases or Biospecimens tabs, you can browse available metadata. Clicking the Files tab will take you to an interactive table listing all files available for download.

HTAN Portal: Files Tab

At this point, the Files tab is likely to contain hundreds of files, and may be difficult to navigate. You can further refine the files table by clicking on the Assay Type or File Type filters. This will trigger pop-up windows that describe the assay and file type categories available within the Vanderbilt project. For example, if you click Assay Type you will see:

HTAN Portal: Filter by Assay Type

Assay types that are available within the Vanderbilt project are shown in bold. You can therefore see that the Vanderbilt project has Bulk DNA, H&E Images, Multiplex ImmunoFluorescence images, and Single Cell RNA Seq data.

If you click scRNA-seq, the files table will automatically update. You can then select the File Type filter to drill down even further:

HTAN Portal: Filter by File Type

Clicking Level 4 here will filter the files table to include only Level 4 sequencing data, which consists of Single Cell RNA Seq files in h5ad format:

HTAN Portal: Multiple Filters Enabled

Note that you can remove any existing filters by clicking on any of the “chips” in the page header. For example, if you want to remove the Level 4 filter, just click the Level 4 chip:

HTAN Portal: Removing Filters

Clicking View Details on any of these files will pop open a metadata table. For example:

HTAN Portal: Metadata Details

# Downloading Open Access Data

Once you have specified your filter criteria, the Files tab will display all matching files. At this point, you may see two types of files:

  • Open Access Files; and
  • Access Controlled Files

Open access files are hosted on Synapse. Once you have registered for a Synapse account and logged in, you can click through from the HTAN Data Portal to the corresponding page on Synapse. For example, this CSV file corresponds to single cell data generated by the Vanderbilt project:

Synapse Page

You can click the Download Options button to download the file. To retrieve the same file via R, Python, or the command line, click the Download Options button and select Programmatic Options.
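The Programmatic Options dialog shows the exact snippet for each file, and that should be treated as authoritative. As a rough sketch of what a Python download can look like, the example below assumes the `synapseclient` package is installed and that you have a Synapse personal access token. The helper names (`is_synapse_id`, `make_client`, `download_file`) are hypothetical, and the identifier pattern ("syn" followed by digits) is an assumption based on the example identifier shown earlier.

```python
import re

# Assumed pattern: "syn" followed by one or more digits, e.g. syn25010909.
SYNAPSE_ID = re.compile(r"syn\d+")


def is_synapse_id(value):
    """Return True if `value` looks like a Synapse identifier."""
    return SYNAPSE_ID.fullmatch(value) is not None


def make_client(auth_token):
    """Log into Synapse with a personal access token and return the client."""
    # Imported lazily so the other helpers work without the package installed.
    import synapseclient

    syn = synapseclient.Synapse()
    syn.login(authToken=auth_token)
    return syn


def download_file(client, synapse_id, dest_dir="."):
    """Download one file into `dest_dir` and return its local path."""
    if not is_synapse_id(synapse_id):
        raise ValueError(f"not a Synapse identifier: {synapse_id!r}")
    entity = client.get(synapse_id, downloadLocation=dest_dir)
    return entity.path
```

A typical call would be `download_file(make_client(token), synapse_id, "data")`, where `token` is your own access token and `synapse_id` is the identifier shown on the file's Synapse page.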

To download access-controlled files, please refer to Access Controlled Data Access.