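In recent years, complex diseases (such as malignant tumors and cardiovascular and cerebrovascular diseases) have gradually become the leading disease burden in modern society, causing enormous health and economic losses [1]. Complex diseases arise from the interaction of lifestyle, environmental, and genetic factors, and investigating their pathogenesis has become a major topic in modern etiological research. The cohort study is one of the most fundamental analytical study designs in epidemiology and holds an irreplaceable place in etiological research. When testing etiological hypotheses, a cohort study can examine the pathogenic effect of harmful exposures, while a population-based cohort can be used to study the relationships between multiple exposures and multiple health outcomes, with results that generalize well [2]. Such cohorts are not only an important research tool for addressing some of the pressing problems of modern medicine but also an important foundation for translational medicine research [3].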
The cohort study is one of the most important epidemiological methods and plays an irreplaceable role in etiological research. With a cohort design, genetic and environmental information can be collected accurately and continuously, and omics biomarkers can be identified and validated to provide evidence for precision public health and precision medicine. However, results from a new cohort would not be available for at least ten years: about five years are needed for funding, planning and enrolment, and another five for follow-up before even the earliest analyses of the most common diseases can be done. Results for most cancers would take longer, at a cost that many investigators or institutions cannot afford. An alternative strategy is to make use of existing cohort studies by sharing data among them. Data sharing among cohort studies is beneficial in many ways. It can yield sample sizes that are unattainable in a single study, increase statistical power, enable more accurate and detailed subgroup analyses, and improve the generalizability of results. It also facilitates the exchange of experience and mutual learning, avoids duplicated research, and promotes the secondary use of existing data (i.e., using old data to discover new results). Data sharing saves the costs of staff recruitment, follow-up, and laboratory analysis, offering a high return on investment and economies of scale. It also enables cross-validation and replication of findings across different datasets. Many international research funding agencies and leading research groups have reached consensus on the principles and goals for promoting the sharing of medical research data. Owing to the rapid development of cohort studies in past decades, China already has a foundation for data sharing among cohort studies.
Unfortunately, most existing cohort studies are self-contained and independent, with low visibility and insufficient cooperation and data sharing among them. The academic value of the data already collected in these cohort studies has not yet been fully exploited. Therefore, the China Cohort Consortium is working to establish a multilevel, three-dimensional strategy for cooperation and data sharing. We hope that, by providing data management, data integration, data interaction, tool development, data repositories and other functions, it will encourage researchers from public health, clinical medicine and other related fields to work together more closely.