APPLICATIONS OF ADVANCED STATISTICAL METHODS IN THE PAN-STARRS1 MEDIUM-DEEP SURVEY
Abstract
The application of advanced statistical methods to astrophysical problems is
desirable for reasons of time efficiency and robustness. A data-driven approach,
when combined with physical insight, can expedite solutions to difficult problems
in which data are plentiful but physical insight is nebulous. This may be
due either to the parametric complexity of the assumed models or to the inherent
complexity in the behavior of the astrophysical system itself. In this thesis we
demonstrate this by applying a variety of statistical tools to the Pan-STARRS1
Medium-Deep Survey data to solve two important classification problems
faced by the survey.
The Pan-STARRS1 (PS1) Survey is unique in terms of its temporal, spatial,
and wavelength coverage, permitting extensive studies of known astrophysical
sources such as active galactic nuclei (AGN) and supernovae (SNe), as well as
exotic ones such as tidal disruption events and recoiling supermassive black hole
binaries. The Medium-Deep (MD) survey in particular offers a time resolution on
the order of a few days over 10 distinct 8 sq. deg. fields, or over 80 sq. deg. of
sky, and with the technique of difference imaging, enables the detailed study of
stochastic variations and explosive transients associated with extragalactic sources.
In the first of two parts of this thesis, I outline a novel method for the light-curve
characterization of Pan-STARRS1 Medium-Deep Survey (PS1 MDS) extragalactic
sources as stochastic variables (SV) or burst-like (BL) transients, using
multi-band difference-imaging time-series data. Using a combination of Bayesian
leave-one-out cross-validation and the corrected Akaike information criterion to
model time series in the four PS1 photometric bands g_P1, r_P1, i_P1, and z_P1,
we use a k-means clustering decision algorithm to classify sources as bursting or
stochastically variable with over 91% purity, based on spectroscopically confirmed
AGN and SN verification samples. The performance of our classifier is comparable
to the best among existing methods in terms of purity. We use our method to
classify 4361 difference-image sources with galaxy hosts in the PS1 MD fields as
BL or SV, and then, together with their host-galaxy offsets, create robust samples
of AGN and SNe. From these variability-selected samples, we derive photometry-
and variability-based priors that can be used in future survey data streams for
near real-time classification.
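As a rough illustration of the model-selection ingredient named above, the following Python sketch computes the corrected Akaike information criterion, AICc = 2k - 2 ln L + 2k(k+1)/(n-k-1), for two candidate models of a single-band difference-flux light curve. The constant and linear models, the toy data, and all function names are assumptions made for illustration only; they are not the light-curve models or the code used in the thesis.

```python
import math


def gaussian_log_likelihood(flux, flux_err, model):
    """Log-likelihood of the data given model values, assuming independent
    Gaussian errors with known per-point standard deviations."""
    return -0.5 * sum(
        ((f - m) / e) ** 2 + math.log(2.0 * math.pi * e * e)
        for f, e, m in zip(flux, flux_err, model)
    )


def aicc(log_likelihood, n_params, n_points):
    """Corrected AIC: the usual AIC plus a small-sample penalty term."""
    if n_points - n_params - 1 <= 0:
        raise ValueError("AICc requires n_points > n_params + 1")
    aic = 2.0 * n_params - 2.0 * log_likelihood
    return aic + 2.0 * n_params * (n_params + 1) / (n_points - n_params - 1)


def fit_constant(t, flux, flux_err):
    """Inverse-variance weighted mean (1 free parameter)."""
    w = [1.0 / e ** 2 for e in flux_err]
    mean = sum(wi * f for wi, f in zip(w, flux)) / sum(w)
    return [mean] * len(t)


def fit_line(t, flux, flux_err):
    """Weighted least-squares straight line (2 free parameters)."""
    w = [1.0 / e ** 2 for e in flux_err]
    sw = sum(w)
    swt = sum(wi * ti for wi, ti in zip(w, t))
    swf = sum(wi * f for wi, f in zip(w, flux))
    swtt = sum(wi * ti * ti for wi, ti in zip(w, t))
    swtf = sum(wi * ti * f for wi, ti, f in zip(w, t, flux))
    slope = (sw * swtf - swt * swf) / (sw * swtt - swt ** 2)
    intercept = (swf - slope * swt) / sw
    return [intercept + slope * ti for ti in t]


# Toy light curve (hypothetical values): a steadily rising difference flux.
t = [float(i) for i in range(10)]
flux = [0.1, 0.9, 2.2, 2.8, 4.1, 5.0, 5.9, 7.2, 7.9, 9.1]
flux_err = [0.2] * len(t)

aicc_const = aicc(
    gaussian_log_likelihood(flux, flux_err, fit_constant(t, flux, flux_err)),
    n_params=1, n_points=len(t))
aicc_line = aicc(
    gaussian_log_likelihood(flux, flux_err, fit_line(t, flux, flux_err)),
    n_params=2, n_points=len(t))

# The model with the lower AICc is preferred.
preferred = "line" if aicc_line < aicc_const else "constant"
print(preferred, aicc_const, aicc_line)
```

The model with the lower AICc is preferred. In a multi-band setting, per-band statistics of this kind could plausibly serve as input features to a clustering step, although the exact features used in the thesis are not specified here.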
In the second part, I discuss applications of a genetic-algorithm-optimized
support vector machine (GA-SVM), a machine-learning classification and regression
tool that we developed to solve two important problems in astronomical surveys:
(a) star-galaxy classification, where we show, as proof of concept, the efficient
separation of 11000 stars and galaxies in the MD fields using 32 photometric
parameters derived from the PS1 MD stack [1]; and (b) photometric redshift
regression, where, as proof of concept, we predict with high accuracy the
photometric redshifts of 5000 galaxies in the COSMOS survey, based on 25
photometric parameters derived from the survey. We show that our GA-SVM method
is more efficient than existing methods for star-galaxy classification, and more
robust than existing methods for photometric redshift estimation.