Detecting lung cancer trends by leveraging real-world and internet-based data: Infodemiology study

Chenjie Xu, Hongxi Yang, Li Sun, Xinxi Cao, Yabing Hou, Qiliang Cai, Peng Jia, Yaogang Wang

Research output: Contribution to journal › Article › Academic › peer-review



Background: Internet search data on health-related terms can reflect people’s concerns about their health status in near real time, and hence serve as a supplementary metric of disease characteristics. However, studies using internet search data to monitor and predict chronic diseases at a geographically finer state-level scale are sparse.

Objective: The aim of this study was to explore the associations of internet search volumes for lung cancer with published cancer incidence and mortality data in the United States.

Methods: We used Google relative search volumes, which represent the search frequency of specific search terms in Google. We performed cross-sectional analyses of the original relative search volumes and disease metrics at both the national and state levels. A smoothed time series of relative search volumes was created to eliminate the effects of irregular changes on the search frequencies and to obtain the long-term trends of search volumes for lung cancer at both levels. We also performed analyses of decomposed Google relative search volume data and disease metrics at the national and state levels.
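
As a rough illustration of the smoothing and decomposition steps mentioned in the Methods, the Python sketch below applies a centered moving average to a monthly series and splits it into trend, seasonal, and residual parts. The abstract does not say which smoothing or decomposition method the authors used, so the classical additive decomposition, the 12-month period, and the function names `moving_average` and `decompose_additive` are all assumptions made for this example.

```python
def moving_average(series, window=12):
    """Centered moving average; positions without a full window are None.

    For an even window, the two end points get half weight (a 2 x window
    average), which keeps the average centered on a single month.
    """
    half = window // 2
    out = [None] * len(series)
    for i in range(half, len(series) - half):
        chunk = series[i - half:i + half + 1]
        if window % 2 == 0:
            total = 0.5 * chunk[0] + sum(chunk[1:-1]) + 0.5 * chunk[-1]
        else:
            total = sum(chunk)
        out[i] = total / window
    return out


def decompose_additive(series, period=12):
    """Classical additive decomposition: series = trend + seasonal + residual.

    This is an assumed method for illustration, not necessarily the one
    used in the study.
    """
    trend = moving_average(series, period)

    # Average the detrended values for each position within the period.
    buckets = [[] for _ in range(period)]
    for i, (x, t) in enumerate(zip(series, trend)):
        if t is not None:
            buckets[i % period].append(x - t)
    phase_means = [sum(b) / len(b) if b else 0.0 for b in buckets]

    # Center the seasonal component so it sums to zero over one period.
    offset = sum(phase_means) / period
    seasonal = [phase_means[i % period] - offset for i in range(len(series))]

    residual = [
        x - t - s if t is not None else None
        for x, t, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, residual
```

Here the trend component plays the role of the smoothed long-term series, and the residual captures the irregular changes that smoothing is meant to remove.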

Results: The monthly trends of lung cancer-related internet hits were consistent with the trends of reported lung cancer rates at the national level. Ohio had the highest frequency for lung cancer-related search terms. At the state level, the relative search volume was significantly correlated with lung cancer incidence rates in 42 states, with correlation coefficients ranging from 0.58 in Virginia to 0.94 in Oregon. Relative search volume was also significantly correlated with mortality in 47 states, with correlation coefficients ranging from 0.58 in Oklahoma to 0.94 in North Carolina. Both the incidence and mortality rates of lung cancer were correlated with decomposed relative search volumes in all states excluding Vermont.
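
The state-level correlation coefficients reported above can be pictured with a short sketch. The abstract does not name the type of correlation coefficient, so Pearson's r is assumed here, and the function name `pearson_r` and the sample values are hypothetical rather than taken from the study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical yearly values for one state (illustrative only, not study data).
search_volume = [1, 2, 3, 4, 5]
incidence_rate = [2, 1, 4, 3, 5]
r = pearson_r(search_volume, incidence_rate)
```

In practice such a coefficient would be computed once per state, pairing that state's relative search volumes with its published incidence or mortality rates over the same periods.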

Conclusions: Internet search behaviors could reflect public awareness of lung cancer. Research on internet search behaviors could be a novel and timely approach for monitoring and estimating the prevalence, incidence, and mortality rates of a broader range of cancers and other health issues.

Original language: English
Article number: e16184
Pages (from-to): 1-11
Number of pages: 11
Journal: Journal of Medical Internet Research
Issue number: 3
Publication status: Published - 12 Mar 2020


