The Undergraduate Student Research Awards (USRA) programs expose undergraduate students to research, with the goal of encouraging students to pursue graduate studies leading to research careers. Previous students have been able to link their USRA to a co-op work term; please contact the Science Co-op office for more information on this possibility. If you are assigned to a project and would like to link the USRA to a co-op work term, please check first with your supervisor. USRA research assistants are expected to satisfy the co-op requirements independently, without imposing additional work on the supervisors.
Science students have access to two programs, the Natural Sciences and Engineering Research Council (NSERC) USRA program and the SFU Vice President, Research (VPR) USRA program. NSERC USRAs are restricted to Canadian citizens and permanent residents; VPR USRAs are open to international students. Canadian citizens and permanent residents should apply to the NSERC program only. For more information, please see the SFU Dean of Graduate Studies USRA website.
The Department of Statistics and Actuarial Science is hoping to appoint up to eight students during the Summer of 2018. A preliminary list of proposed projects is given below, and may be updated in the coming weeks. We thank all applicants for their interest but request that they refrain from contacting supervisors. Supervisors will contact applicants selected for an interview in due course.
Resumption of Ovarian Function Following Parturition in a Subsistence Mayan Indigenous Population of Rural Guatemala
Supervisor: Rachel Altman
In addition to working with Rachel Altman, the chosen student will work collaboratively with an undergraduate student and faculty member in the Faculty of Health Sciences. The team will investigate the role of socio-ecological factors affecting the resumption of ovarian function following parturition in a subsistence Mayan indigenous population of rural Guatemala. Duties will include helping to conduct an extensive literature review, assisting in data management and analyses, and writing a manuscript for publication. The chosen student will contribute to all aspects of the project, but will spend proportionately more time on data analysis and presentation of results.
LDheatmap code enhancements
Supervisor: Brad McNeney
LDheatmap is an R package written and maintained by the Statistical Genetics Working Group at SFU. In the 10 years since its publication on the Comprehensive R Archive Network (CRAN), LDheatmap has been cited in 139 research papers in fields such as agriculture and medicine. Development of the package has been slow in the past few years, and the list of feature requests from users is growing. The NSERC USRA will be responsible for:
- Familiarizing themselves with the LDheatmap code,
- Compiling feature requests and providing a preliminary assessment of the feasibility of each one,
- Writing R code to implement new features,
- Documenting all work.
Improving the Hosmer-Lemeshow Goodness-of-Fit Test
Supervisor: Tom Loughin; Position filled
The real relationship from which data are drawn may not adhere to a proposed model for this structure. Goodness-of-fit (GOF) tests help to identify when a model is a poor fit for a given dataset. A special GOF test for logistic regression models, the Hosmer-Lemeshow test, is used extensively in applications in many disciplines. It has been extensively studied, and some of its properties are somewhat surprising. There is evidence that the test functions poorly under certain circumstances. We will explore the origins of some of these properties and suggest ways to improve the test based on these findings. For example, the test statistic consists of a sum of squared Pearson residuals computed on grouped data. Preliminary evidence suggests that using adjusted Pearson residuals may allow the test to maintain its size better under certain circumstances. We will examine this, and other possible corrections to the test.
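The statistic described above, a sum of squared Pearson residuals computed on grouped data, can be sketched in Python as follows. This is a minimal illustration rather than the project's code: the function name `hosmer_lemeshow_statistic`, the default of ten groups, and the near-equal-size grouping via `numpy.array_split` are assumptions made for the example, and the adjusted Pearson residuals mentioned above are not implemented.

```python
import numpy as np


def hosmer_lemeshow_statistic(y, p_hat, n_groups=10):
    """Return the Hosmer-Lemeshow statistic and its nominal degrees of freedom.

    y      : binary responses (0 or 1)
    p_hat  : fitted probabilities from a logistic regression model
    Assumes no group has a mean fitted probability of exactly 0 or 1.
    """
    y = np.asarray(y, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)

    # Sort observations by fitted probability, then cut into groups of
    # (nearly) equal size. This grouping rule is an assumption of the sketch.
    order = np.argsort(p_hat, kind="stable")
    groups = np.array_split(order, n_groups)

    stat = 0.0
    for idx in groups:
        n_g = len(idx)
        observed = y[idx].sum()        # observed number of events in the group
        expected = p_hat[idx].sum()    # expected number of events in the group
        p_bar = expected / n_g         # mean fitted probability in the group
        # Squared Pearson residual for the grouped binomial count.
        stat += (observed - expected) ** 2 / (n_g * p_bar * (1.0 - p_bar))

    # The statistic is conventionally compared to chi-square with g - 2 df.
    return stat, n_groups - 2


# Toy usage with made-up data: all fitted probabilities equal 0.5.
stat, df = hosmer_lemeshow_statistic([1, 1, 0, 0], [0.5, 0.5, 0.5, 0.5], n_groups=2)
```

Sorting before grouping means the result depends on how ties in the fitted probabilities are broken, which is one of the ways the grouped statistic can behave unexpectedly.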
LDheatmap documentation revamp
Supervisor: Brad McNeney
LDheatmap is an R package written and maintained by the Statistical Genetics Working Group at SFU. In the 10 years since its publication on the Comprehensive R Archive Network (CRAN), LDheatmap has been cited in 139 research papers in fields such as agriculture and medicine. The documentation in the package is in need of an update. First, we have received numerous suggestions from users for clarifications to the existing documentation. Second, we would like to adopt state-of-the-art tools for maintaining documentation. The NSERC USRA will be responsible for:
- Familiarizing themselves with the LDheatmap code and documentation,
- Assessing the suitability of modern R documentation systems, such as roxygen2, for the needs of the LDheatmap project,
- Porting existing documentation to the new system, and
- Keeping notes to document all changes to the package.
Supervisor: Liangliang Wang
This project involves applications of some software packages in computational biology and statistical data analysis for human microbiome data. R programming and good writing are required. Being familiar with Java is a plus but not required.
Effect of Air Pollution on Public Health
Supervisor: Jiguo Cao
Asthma is a common respiratory disease that affects the lives of an increasing percentage of Americans. Studies have shown that some types of air pollutants are directly associated with asthma attacks. In this project, we will use modern artificial intelligence methods to investigate the effects of daily trajectories of SO2, PM2.5, and ozone on the asthma hospitalization rate.
Genotype Imputation for Statistical Analysis of Alzheimer's Disease
Supervisor: Jinko Graham
Dementia is a general term for loss of memory and other cognitive abilities serious enough to interfere with daily life. Alzheimer's disease is the most common form of dementia, accounting for 60-80 percent of cases. The disease is progressive and there is no known cure. The greatest risk factor is increasing age. However, Alzheimer's disease is not a normal part of aging. Genetics influences our risk of developing Alzheimer's disease and could play a role in early detection. In an initial case-control study, the Alzheimer's Disease Neuroimaging Initiative (ADNI) collected information or "genotypes" on an initial set of genome-wide variation. In a second case-control study involving new subjects, ADNI collected genotypes for a different set of genome-wide variation having some overlap with the first. We would like to impute the variation from the first set of genotypes into the second set of genotypes in the second study, using the spatial correlation of variation in the genome. This project will consist of learning and documenting the imputation process for the ADNI data and the workflow to automate the imputation.
Alternatives to Gaussian Processes for Model Calibration
Supervisor: Derek Bingham
Rapid growth in computing power has improved the ability to simulate complex systems. In some applications, interest lies in combining simulations and field data (i.e., model calibration; Kennedy and O'Hagan, 2001). Gaussian processes (GPs) are used here because they (i) are good non-parametric regression estimators; and, most importantly, (ii) provide a foundation for statistical inference for deterministic simulators. In the latter case, since there is no randomness in the simulator observations, the predictive uncertainty is the result of possible sample paths the GP can take after conditioning on the data. In this work, we will investigate using an over-specified set of basis functions (say, Legendre polynomials; e.g., see Xiu and Karniadakis, 2003) to emulate the GP. Here, "over-specified" means the candidate set of bases can be much larger than the number of observations. Model fitting will be set within a Bayesian framework, and thus the simulator can also be viewed as a random function a priori. The chosen set of bases can be larger than the number of runs of the simulator to encourage interpolation. To fit the model, basis functions will be selected from the candidate set using stochastic search variable selection (George and McCulloch, 1993), which switches between different sets of bases of possibly different sizes: each iteration of the variable selection procedure can consider a different selection of basis functions from the candidate set. From a foundational view, the randomness in predictions at new locations comes from the different sets of basis functions that represent the observations, and any lack of interpolation can be attributed to the inability to resolve high-frequency variation. This is a new viewpoint. The proposed work will implement this method (we have worked out the math) and compare its predictive performance with that of the GP in the model calibration context.
The result of this methodology will be an approach that is computationally faster than the GP, with similar predictive ability.
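The idea of emulating a deterministic simulator with more Legendre basis functions than simulator runs can be illustrated with a short Python sketch. It is not the proposed method: in place of stochastic search variable selection it uses a fixed Gaussian prior on all basis weights, so the predictive uncertainty comes from the weights rather than from switching between sets of bases. The function names, the 30 basis functions, the prior and nugget variances, and the toy simulator `sin(3x)` are all assumptions made for the example.

```python
import numpy as np
from numpy.polynomial import legendre


def fit_legendre_emulator(x, y, n_basis=30, prior_var=1.0, noise_var=1e-6):
    """Condition a N(0, prior_var * I) prior on basis weights on runs (x, y).

    x must lie in [-1, 1], the natural domain of Legendre polynomials.
    noise_var is a small nugget that keeps the linear solve stable.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    phi = legendre.legvander(x, n_basis - 1)      # shape (n_runs, n_basis)
    # Implied covariance of the runs: the basis expansion acts like a GP
    # with kernel prior_var * sum_k P_k(x) P_k(x').
    gram = prior_var * phi @ phi.T + noise_var * np.eye(len(x))
    return {"phi": phi, "gram": gram, "alpha": np.linalg.solve(gram, y),
            "prior_var": prior_var, "n_basis": n_basis}


def predict(fit, x_new):
    """Return the predictive mean and variance at new inputs."""
    x_new = np.asarray(x_new, dtype=float)
    phi_new = legendre.legvander(x_new, fit["n_basis"] - 1)
    cross = fit["prior_var"] * phi_new @ fit["phi"].T   # (n_new, n_runs)
    mean = cross @ fit["alpha"]
    prior = fit["prior_var"] * np.sum(phi_new ** 2, axis=1)
    reduction = np.sum(cross * np.linalg.solve(fit["gram"], cross.T).T, axis=1)
    return mean, prior - reduction


# Toy usage: 8 runs of a made-up deterministic simulator, 30 basis functions.
x_runs = np.linspace(-1.0, 1.0, 8)
y_runs = np.sin(3.0 * x_runs)
fit = fit_legendre_emulator(x_runs, y_runs)
mean, var = predict(fit, np.array([-0.5, 0.0, 0.5]))
```

Because there are more basis functions than runs, the fitted surface can pass (almost) through every run, while the unused directions in the basis leave positive predictive variance between runs.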
Dealing with Statistical Challenges in Analysis of SFU-FLIP Study Data
Supervisor: Joan Hu; Position filled
A promising strategy for reducing the incidence and severity of fall-related injuries in long-term care (LTC) is to decrease the ground surface stiffness, and the subsequent forces applied to the body parts at impact, through installation of compliant flooring that does not substantially affect balance or mobility (Lachance et al, 2016). The flooring for injury prevention (FLIP) study at SFU (Lachance et al, 2016) is a 4-year, parallel-group, 2-arm, randomised controlled superiority trial of flooring in 150 resident rooms at an LTC site. The study’s primary objective is to determine whether compliant flooring (intervention) reduces serious fall-related injuries relative to control flooring, and the primary outcome is serious fall-related injury due to a fall in a study room. The study team plans to complete its primary data analysis by April 2018.
This NSERC USRA project (May 2018 – August 2018) aims to address the following statistical challenges arising from the study data analysis:
- During the study follow-up time, some study rooms had more than one resident. Rather than following the study’s initial randomization scheme, new residents moved in on a “first come, first served” basis when a study room became available. This may violate (i) the balance in subject demographics between the two arms achieved by the randomization and (ii) the conventional assumption of independence among subjects.
- Some study subjects experienced multiple injuries, some of which were of interest to the study and some not. To make full use of the available data, approaches for recurrent events need to be considered instead of conventional survival analysis approaches, and thus the R functions for survival analysis cannot be applied directly.
- Since this is a study with human subjects, not all non-intervention-related conditions were controlled, and there may be competing interventions. Hip protector use, for example, is common at the LTC site and was not altered for the purpose of the FLIP Study; this likely resulted in fewer serious fall-related injuries. As another example, some subjects did not have the anticipated follow-up time of 4 years. One cause was death during the study follow-up, which potentially leads to informative censoring because it may reduce the size of the vulnerable subject group.
If you are interested in applying, please follow the procedure below:
Note: We thank all applicants for their interest but request that they refrain from contacting supervisors. Supervisors will contact applicants selected for an interview in due course.
Canadian citizens and permanent residents should apply for the NSERC USRA only; please do not apply for the VPR USRA. If you are eligible for the NSERC USRA:
- Go online at the NSERC site: www.nserc-crsng.gc.ca/Students-Etudiants/UG-PC/USRA-BRPC_eng.asp.
- Please submit a printed NSERC Form 202 (Part I) with your NSERC Online Reference # and an up-to-date unofficial transcript (not an advising transcript). These must be submitted to the Grad Secretary in the Department of Statistics & Actuarial Science, Room SC K10547, by January 29, 2018.
- Selected students will be notified by the department. These students must verify applications online by February 3, 2018. Note that this includes uploading transcripts and having a supervisor start Form 202 Part II.
If you are only eligible for the VPR USRA:
- Go to www.sfu.ca/dean-gradstudies/awards/undergraduate-awards/sciences-awards.html and complete the student portion of the application form. In the section on Award Information, filling in the proposed supervisor is optional.
- Once you have completed the student portion, print it and attach an up-to-date unofficial transcript (not an advising transcript). These must be submitted to the Grad Secretary in the Department of Statistics & Actuarial Science, Room SC K10547, by January 29, 2018.
- Selected students will be notified by the project supervisors.
- Supervisors will fill out the 'Supervisor Information' and 'Research Project' sections of the form.
NSERC & VPR USRA Holders
|Project||Supervisor||Student||Term||Award|
|Sports Analytics||T. Swartz||D. Chu||Summer||NSERC USRA|
|Goodness of Fit Testing for Poisson Regression||T. Loughin||N. Surjanovic||Summer||NSERC USRA|
|Identifying and Cataloging Minimal SOS Designs||B. Tang||R. Groenewald||Summer||NSERC USRA|
|Comparison of software packages in Bayesian phylogenetics||L. Wang||G. Feng||Summer||VPR USRA|
|Inflation Modelling: Implications for Insurers and Pension Plans||J.F. Bégin||J. Tang||Summer||VPR USRA|
|Sports Analytics||T. Swartz||L. Wu||Summer||VPR USRA|
|What's the successful trading rules in stock market?||J. Cao||B. Thind||Summer||VPR USRA|