This KeenerKo_DATASET_Readme.txt file was generated on 2024-12-02 by Dennis Ko


GENERAL INFORMATION

1. Title of Dataset: Data from: Human genetic variation reveals FCRL3 is a lymphocyte receptor for Yersinia pestis

2. Author Information
   A. Principal Investigator Contact Information
      Name: Dennis Ko
      Institution: Department of Molecular Genetics & Microbiology and Department of Medicine
      Address: 213 Research Drive | Box 3053 DUMC | Durham, N.C. 27710
      Email: dennis.ko@duke.edu

3. Date of data collection (single date, range, approximate date): 2007-2024

4. Geographic location of data collection: Seattle, WA and Durham, NC, using LCLs collected as part of the International HapMap Project and 1000 Genomes Project

5. Information about funding sources that supported the collection of the data: These data were generated with support from R01AI118903


SHARING/ACCESS INFORMATION

1. Licenses/restrictions placed on the data: CC0

2. Links to publications that cite or use the data: These data are part of "Keener et al. 2024. Human genetic variation reveals FCRL3 is a lymphocyte receptor for Yersinia pestis"

3. Links to other publicly accessible locations of the data: NA

4. Links/relationships to ancillary data sets: NA

5. Source Data:
   HapMap Project: https://ftp.ncbi.nlm.nih.gov/hapmap/
   1000 Genomes Project: https://ftp.1000genomes.ebi.ac.uk/vol1/ftp/

6. Recommended citation for this dataset: Keener, R., Wang, L. and Ko, D. (2024). Data from: Human genetic variation reveals FCRL3 is a lymphocyte receptor for Yersinia pestis. Duke Research Data Repository. https://doi.org/10.7924/r43n2d008.


DATA & FILE OVERVIEW

1. File List: GWAS summary statistics for Y. pestis invasion in 961 LCLs generated using QFAM-parents in PLINK with adaptive permutation.

2. Relationship between files, if important: NA

3. Additional related data collected that was not included in the current data package: See the manuscript Keener et al. 2024

4. Are there multiple versions of the dataset? No


METHODOLOGICAL INFORMATION

1. Description of methods used for collection/generation of data: For details of methods refer to manuscript Keener et al. 2024. Hi-HoST screening of 961 LCLs from parent-offspring trios for Y. pestis occurred in two large sets. In one, Y. pestis invasion was measured in 527 LCLs from four populations in the 1000 Genomes Project (34): ESN (Esan in Nigeria), GWD (Gambians in Western Divisions in The Gambia), IBS (Iberian Population in Spain), and KHV (Kinh in Ho Chi Minh City, Vietnam). To these LCLs, we added 434 LCLs from four populations in the HapMap project: CEU (Utah residents with ancestry from northern and western Europe), YRI (Yoruba in Ibadan, Nigeria), CHB (Han Chinese in Beijing, China), and JPT (Japanese in Tokyo, Japan) (35). For all 961 LCLs, we used flow cytometry to quantify GFP+ host cells, which contain viable GFP-tagged Y. pestis. Each LCL was measured on three sequential passages, and the phenotype used for GWAS was calculated as the mean of these three independent assays.

2. Methods for processing the data: For details of methods refer to manuscript Keener et al. 2024. Genotypes were obtained from HapMap r28 and 1000 Genomes Project Phase 3, with imputation using 1000 Genomes Project Phase 3. Variants with minor allele frequency (MAF) < 0.01 or SNP missingness > 0.2, and samples with genotype missingness > 0.2, were excluded, resulting in a total of 15213612 SNPs for subsequent analysis. Genome-wide association analysis was carried out using the QFAM-parents approach in PLINK v1.9 (5, 36) with adaptive permutations ranging from 1000 to a maximum of 1x10^9.

3. Instrument- or software-specific information needed to interpret the data: text file

4. Standards and calibration information, if appropriate: No.

5. Environmental/experimental conditions: LCLs were maintained at 37°C in a 5% CO2 atmosphere and were grown in RPMI 1640 media (Invitrogen) supplemented with 10% fetal bovine serum (FBS), 2 mM glutamine, 100 U/ml penicillin-G, and 100 mg/ml streptomycin.

6. Describe any quality-assurance procedures performed on the data: NA

7. People involved with sample collection, processing, analysis and/or submission: Data generation and processing conducted by Rachel Keener, Liuyang Wang, and Dennis Ko


DATA-SPECIFIC INFORMATION FOR: KeenerKo_HH3_SummaryStats_Ypestis4hrstringent.txt

Permutation results file generated using QFAM-parents in PLINK

1. Number of variables: 1

2. Number of cases/rows: 15213612

3. Variable List: The file contains the following columns

   Line number
   POS         Chromosomal position
   CHR         Chromosome
   SNP         SNP ID
   BETA        Regression slope for real data
   EMP_BETA    Sample mean of permutation regression slopes
   EMP_SE      Sample stdev of permutation regression slopes
   EMP1        Empirical p-value
   NP          Number of permutations performed
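
EXAMPLE: READING THE SUMMARY STATISTICS (illustrative sketch)

The column list above can be read with a few lines of standard-library Python. This is a minimal sketch, not part of the deposited analysis. It assumes the file is whitespace-delimited with a header row naming the columns as listed, that missing values are written as NA, and that data rows may carry one extra leading line-number field (labeled LINE here). The function names, the demo rows, and the threshold in the demo are hypothetical. The p_floor helper relies on the usual PLINK definition of the empirical p-value, (R+1)/(N+1) for N permutations, which is an assumption about how EMP1 was computed for this file.

```python
import io

# Columns to convert from text; the names follow the variable list above.
NUMERIC = {"POS": int, "NP": int, "BETA": float,
           "EMP_BETA": float, "EMP_SE": float, "EMP1": float}


def _convert(text, cast):
    """Return None for missing values (assumed to be written as NA)."""
    if text in ("NA", "NaN", "nan", ""):
        return None
    return cast(float(text))


def parse_qfam(handle):
    """Yield one dict per SNP from a whitespace-delimited text stream."""
    header = handle.readline().split()
    for line in handle:
        fields = line.split()
        if not fields:
            continue
        # Assumption: an unnamed leading line-number field may be present.
        if len(fields) == len(header) + 1:
            names = ["LINE"] + header
        else:
            names = header
        row = dict(zip(names, fields))
        for key, cast in NUMERIC.items():
            if key in row:
                row[key] = _convert(row[key], cast)
        yield row


def significant(rows, threshold):
    """Keep rows whose empirical p-value is at or below the threshold."""
    return [r for r in rows
            if r.get("EMP1") is not None and r["EMP1"] <= threshold]


def p_floor(row):
    """Smallest empirical p-value attainable with NP permutations."""
    return 1.0 / (row["NP"] + 1)


# Hypothetical demo rows, not taken from the dataset.
sample = io.StringIO(
    "POS CHR SNP BETA EMP_BETA EMP_SE EMP1 NP\n"
    "1000 1 rs_demo1 0.50 0.01 0.20 1e-06 999999\n"
    "2000 1 rs_demo2 -0.10 0.00 0.20 0.4 1000\n"
)
rows = list(parse_qfam(sample))
for r in significant(rows, 1e-5):
    print(r["SNP"], r["EMP1"], p_floor(r))
```

To run this on the real file, pass an open file handle instead of the demo stream, for example open("KeenerKo_HH3_SummaryStats_Ypestis4hrstringent.txt"). Because permutation was adaptive, NP differs between SNPs, so comparing EMP1 with p_floor shows whether a SNP's p-value is limited by the number of permutations performed.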