TCIA Query Tool
This page describes the TCIA Query Tool and gives instructions for using it. Users can search the NLST database to identify subsets of the NLST population that are relevant to their research questions. They can then download CT images, view pathology images, and download data from those sub-populations.
The Query Tool is a Web-based application developed and administered by The Cancer Imaging Archive (TCIA) group at Washington University in St. Louis, where the images are stored.
How to Launch the Query Tool
- If your research team has not already done so, then you must submit a project proposal through CDAS. A single request covers a team of “Approved Users”.
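Follow these steps to launch the Query Tool after the request process is complete, that is, after the request has been submitted and approved, the Data Transfer Agreement (DTA) is in place, and the request status is “Delivered”.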
- Go to My Projects. You will be prompted to log in if you have not already logged in.
- Click on the “Browse NLST Images” button in the upper right corner of the window.
- The Query Tool will then launch as a tab in your web browser.
If you experience any difficulties, contact CDAS for assistance.
Restrictions (Max = 3,000 CT participants): The images are to be used only in conjunction with the approved request. Having an approved project does not entitle you to download the entire CT image database, or large portions of it, for future unspecified projects. As a precaution against such misuse, there is a default limit of 3,000 participants from whom CT images can be downloaded for any approved project. Investigators wishing to exceed this limit should submit a request through CDAS explaining the need for additional images. These requests will be evaluated by NCI, with approval at the discretion of NCI.
Alternatives: Copies of CT images and/or pathology images may be obtained on electronic media (e.g., external hard drive) by contacting CDAS. CDAS staff will coordinate your request with TCIA. The CT images are in DICOM format, and the pathology images are in Aperio SVS format.
Capabilities of the Query Tool
The TCIA Query Tool is a Web-based application that enables users to select sets of NLST data and/or screening CT images that are relevant to their research questions and download the selected images and associated data. Pathology images of lung cancer tissue may also be viewed (but not downloaded). Data available for identifying images include:
- Demographics (gender, age, race, etc.)
- Risk factors (smoking history, disease history, occupation, etc.)
- Lung cancer diagnosis
- Results of screening exam
- Lung abnormalities observed by the radiologist
- Imaging technical parameters
- Scanner manufacturer and reconstruction filter
- Cause of death (if deceased)
The tool returns a list of all image sets or participants that meet the criteria specified.
The Query Tool can also be used to download the data only, without downloading any images.
Note that the NLST data available through the Query Tool are comprehensive. With a few exceptions, the Query Tool data are equivalent to the datasets available through CDAS. See list of exceptions.
How to build a query
You can find detailed instructions in the Help menu of the Query Tool application, including a user’s manual, a tutorial, and a data dictionary. Basic instructions are below.
When you query the NLST database, you retrieve a subset of the database, i.e. certain columns and rows. Rows represent people (usually), and columns represent attributes of those people (e.g. gender, age). The Query Tool lets you specify which columns and rows to retrieve and then view and/or download the retrieved data. Here are the basic steps.
Choose columns to display by clicking on column names in the “Select Returned Values” tab. Some examples are gender, age, pack years smoked, and lung cancer status.
- On the “Select Returned Values” tab, click on the triangle beside each table name (e.g. “Demographics”) to expand and view the column names.
- Click on the names of the columns you want to view in your retrieved results.
- The column names will be displayed under the word “Selected”.
Specify your population of interest (i.e. which rows) by specifying certain conditions (e.g. gender=Female).
- Click on column names on the “Add Constraints” tab.
Click to specify values of interest on the “Selected” part of the tab using these operators:
- = (equals): match a single value.
- in: match a list of values. Use ctrl + click or shift + click to specify multiple values.
- <> (not equals): match all values except the value specified.
- <, <=, >, >= : standard less-than / greater-than operators.
- Between: match any value between two values that you specify (inclusive).
- Run the query and view the results by clicking on the “Run Query / View Results” tab.
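The steps above can be sketched in Python as a rough illustration of how returned columns and constraints interact. This is not the Query Tool's implementation: the rows, the column names (`pid`, `gender`, `age`, `pack_years`), and the `run_query` helper are hypothetical, and the sketch assumes that a row must satisfy every constraint to be returned.

```python
import operator

# Hypothetical rows; column names and values are illustrative only.
ROWS = [
    {"pid": 1, "gender": "Female", "age": 61, "pack_years": 45},
    {"pid": 2, "gender": "Male", "age": 58, "pack_years": 30},
    {"pid": 3, "gender": "Female", "age": 70, "pack_years": 62},
    {"pid": 4, "gender": "Male", "age": 66, "pack_years": 51},
]

# One function per operator listed on the "Add Constraints" tab.
OPERATORS = {
    "=": operator.eq,
    "<>": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "in": lambda value, choices: value in choices,
    # "between" is inclusive at both ends.
    "between": lambda value, bounds: bounds[0] <= value <= bounds[1],
}


def run_query(rows, returned_columns, constraints):
    """Keep rows that satisfy every constraint, then keep only the chosen columns."""
    selected = []
    for row in rows:
        # Assumption: constraints are combined with AND.
        if all(OPERATORS[op](row[col], val) for col, op, val in constraints):
            selected.append({col: row[col] for col in returned_columns})
    return selected


result = run_query(
    ROWS,
    returned_columns=["pid", "age"],
    constraints=[("gender", "=", "Female"), ("age", "between", (60, 65))],
)
print(result)  # prints [{'pid': 1, 'age': 61}]
```

Changing the constraint list and calling `run_query` again mirrors modifying a query and re-running it in the tool.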
It’s easy to modify a query and re-run it. Some things you can do:
- Add columns and/or conditions by clicking on the “Select Returned Values” and/or “Add Constraints” tabs.
- Remove columns and/or conditions by clicking the “Remove” button by the column name in the “Select Returned Values” or “Add Constraints” tab.
- Change your population (rows) by changing the conditions on the “Add Constraints” tab.
- Run the query again by clicking the “Run Query / View Results” tab.
Cautions for queries
To correctly interpret query results, you need to understand a few things. If your query uses only columns from the first nine tables listed in the query tool, then each resulting row will represent one person, and the results will include all NLST participants that meet the conditions you specify. However, if you use columns from the last eight tables (see below), then your results will include multiple rows per person, and the population may be restricted. The table below describes which database tables cause these scenarios.
| Name of Database Table | What does a row represent? | Population restriction |
|---|---|---|
| ScreeningResults | Screening round | No restriction |
| PositiveScreenFollowupProcs | Screening round | No restriction |
| IMSDerivedSCTScreenVars | Screening round | CT arm |
| SCTImageInfo | CT image series | CT arm with image series |
| LSSPathTumor | Participant | LSS Pathology participants (463) |
| LSSPathDonorBlock | Donor block (and image) | LSS Pathology donor blocks (1254) |
| LSSPathRegOfInterest | Region of interest | LSS Pathology regions of interest (2522) |
| LSSPathTMACore | TMA core | LSS Pathology TMA Cores (7596) |
If you include more than one of these tables in a query, your results will be as follows:
- Only rows that meet ALL population restrictions will be included.
- The resulting rows will represent all possible combinations of rows from the input tables.
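To see why combining these tables multiplies rows, consider the small Python sketch below. It is only an illustration: the rows and column names are made up, and it assumes that rows from different tables are matched on the participant alone, which may not reflect exactly how the Query Tool links tables.

```python
from itertools import product

# Hypothetical rows: participant 1 has three screening rounds and two
# image series; participant 2 has two screening rounds and no images.
screening_results = [
    {"pid": 1, "round": 0},
    {"pid": 1, "round": 1},
    {"pid": 1, "round": 2},
    {"pid": 2, "round": 0},
    {"pid": 2, "round": 1},
]
image_info = [
    {"pid": 1, "series": "A"},
    {"pid": 1, "series": "B"},
]


def combine(left, right, key="pid"):
    """Pair every left row with every right row for the same participant."""
    combined = []
    for left_row, right_row in product(left, right):
        if left_row[key] == right_row[key]:
            combined.append({**left_row, **right_row})
    return combined


rows = combine(screening_results, image_info)
print(len(rows))                          # prints 6 (3 rounds x 2 series)
print(sorted({row["pid"] for row in rows}))  # prints [1]
```

Participant 2 drops out because they have no rows in the second table, which corresponds to the population restriction, and participant 1 appears six times, which corresponds to the combinations of rows. Counting rows in such a result is therefore not the same as counting people.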
How to get CT images
The steps required to download the CT images are different depending on whether you identify your sub-population using the Query Tool or by using the full datasets from CDAS. Here are the details for each method:
Use Query Tool to identify population
- Launch the Query Tool.
- Build a query with conditions that define your population of interest.
- Run the query.
Click “Download Associated CT Images.” This starts the launch process for the NBIA Download Manager application (which requires Java on your computer to run).
- Your web browser will download a file named “main.xhtml”. Open this file. It is a JNLP (Java Network Launch Protocol) file.
- Java will launch on your computer.
- The application will be downloaded to your computer.
- You will be prompted whether you want to run the application. Click “Run.”
- The Download Manager window will then appear with a list of the images to be downloaded. Select the directory location where you want the images stored and click the Start button. The images will begin to download to your computer, and progress bars will display the percent completed.
Use datasets from CDAS to identify population
- Download the datasets from CDAS from the Deliverables tab of your request.
- Use software (e.g., SAS or R) to define your population of interest.
- Contact CDAS to coordinate the transfer of images from TCIA to you on media that you provide (e.g., an external hard drive).
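As an alternative to SAS or R, the population-definition step could also be sketched in Python. The example below is a minimal sketch under stated assumptions: the inline CSV stands in for a CDAS dataset file, and the column names (`pid`, `gender`, `age`, `arm`) and their values are hypothetical, so consult the data dictionary delivered with your request for the real variable names.

```python
import csv
import io

# Stand-in for a CDAS dataset file; column names and values are hypothetical.
SAMPLE_CSV = """pid,gender,age,arm
100001,Female,61,CT
100002,Male,58,CT
100003,Female,70,X-ray
100004,Female,66,CT
"""

# Default per-project limit on CT participants described on this page.
DEFAULT_CT_LIMIT = 3000


def select_participants(csv_file):
    """Return the IDs of female CT-arm participants aged 60 or older."""
    reader = csv.DictReader(csv_file)
    return [
        row["pid"]
        for row in reader
        if row["gender"] == "Female"
        and row["arm"] == "CT"
        and int(row["age"]) >= 60
    ]


pids = select_participants(io.StringIO(SAMPLE_CSV))
print(pids)                           # prints ['100001', '100004']
print(len(pids) <= DEFAULT_CT_LIMIT)  # prints True
```

With a real file, `io.StringIO(SAMPLE_CSV)` would be replaced by a file opened with `open(path, newline="")`. The resulting list of participant IDs is the kind of information you would give CDAS when requesting the image transfer.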
How to view pathology images
The Query Tool allows viewing of pathology images from 463 LSS participants with lung cancer. The images are of thin slides cut from chunks of lung tissue (“donor blocks”) which include both tumor tissue and normal lung tissue. Here are the steps for viewing the images:
- Build a query to identify the people whose pathology images you wish to view.
- In the query, you must include the variable PATHOLOGY_IMAGE from the LSSPathDonorBlock table.
- Run the query.
- In the results tab, you will see buttons labeled “ViewPathologyImage” in the rightmost column. Click on one of those buttons. The image viewer (caMicroscope) will then launch and display the image in your web browser.
- You can zoom using the mouse’s scroll wheel, and drawing tools are available for you to make annotations. Any annotations you make will automatically be saved and will be displayed if you view that image again in future sessions.
If you require copies of the pathology images, they can be obtained in Aperio SVS format. Contact CDAS to request the pathology images on electronic media (e.g., external hard drive).
More information about the pathology images and data can be found in the Query Tool’s help menu and in the pathology data dictionary on TCIA's wiki page.