The application of advanced statistical methods to astrophysical problems is
desirable for reasons of time efficiency and robustness. A data-driven approach,
when combined with physical insight, can expedite solutions to difficult problems
where data are plentiful but physical insight is nebulous. This may be due either
to the parametric complexity of the assumed models or to the inherent complexity
in the behavior of the astrophysical system itself. In this thesis, we demonstrate
that applying a variety of statistical tools to the Pan-STARRS1 Medium-Deep Survey
data solves two important classification problems faced by the survey.

The Pan-STARRS1 (PS1) Survey is unique in terms of its temporal, spatial, and
wavelength coverage, permitting extensive studies of known astrophysical sources
such as active galactic nuclei (AGN) and supernovae (SNe), as well as exotic ones
such as tidal disruption events and recoiling supermassive black hole binaries.
The Medium-Deep (MD) survey in particular offers a time resolution on the order of
a few days over 10 distinct 8 sq. deg. fields, or over 80 sq. deg. of sky, and,
with the technique of difference imaging, enables the detailed study of stochastic
variations and explosive transients associated with extragalactic sources.

In the first of two parts of this thesis, I outline a novel method for the
light-curve characterization of Pan-STARRS1 Medium-Deep Survey (PS1 MDS)
extragalactic sources into stochastic variables (SV) and burst-like (BL)
transients, using multi-band difference-imaging time-series data. Using a
combination of Bayesian leave-one-out cross-validation and the corrected Akaike
information criterion to model time series in the four PS1 photometric bands
g_P1, r_P1, i_P1, and z_P1, we use a k-means clustering decision algorithm to
classify sources as bursting or stochastically variable with over 91% purity,
based on spectroscopically confirmed AGN and SN verification samples. The
performance of our classifier is comparable to the best among existing methods in
terms of purity. We use our method to classify 4361 difference-image sources with
galaxy hosts in the PS1 MD fields as BL or SV and then, together with their
host-galaxy offsets, create robust samples of AGN and SNe. From these
variability-selected samples, we derive photometry- and variability-based priors
that can be used in future survey data streams for near-real-time classification.
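
As a rough illustration of one of the model-comparison statistics named above, the
corrected Akaike information criterion for a model with k free parameters fitted to
n data points is 2k - 2 ln L + 2k(k + 1)/(n - k - 1), where L is the maximized
likelihood. The sketch below only evaluates that standard formula. The function
name, the two candidate models, and all numbers are hypothetical and are not taken
from the thesis, which does not specify its light-curve models in this abstract.

```python
def aicc(log_likelihood, n_params, n_points):
    """Corrected Akaike information criterion (small-sample correction of AIC)."""
    if n_points - n_params - 1 <= 0:
        raise ValueError("AICc requires n_points > n_params + 1")
    aic = 2 * n_params - 2 * log_likelihood
    correction = (2 * n_params * (n_params + 1)) / (n_points - n_params - 1)
    return aic + correction


# Hypothetical fits to the same 20-point light curve in a single band
# (made-up maximized log-likelihoods, for illustration only).
simple_model = aicc(log_likelihood=-50.0, n_params=1, n_points=20)
flexible_model = aicc(log_likelihood=-45.0, n_params=4, n_points=20)

# The model with the lower AICc is preferred.
preferred = "flexible" if flexible_model < simple_model else "simple"
print(preferred, round(simple_model, 3), round(flexible_model, 3))
```

With these made-up inputs, the extra parameters of the more flexible model are
justified by its better fit. Per-band statistics of this kind are the sort of
feature a clustering step could then operate on.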

In the second part, I discuss applications of a genetic-algorithm-optimized
support vector machine (GA-SVM), a machine-learning classification and regression
tool that we developed to solve two important problems in astronomical surveys:
(a) star-galaxy classification, where we show, as a proof of concept, the
efficient separation of 11000 stars and galaxies in the MD fields using 32
photometric parameters derived from the PS1 MD stack [1]; and (b) photometric
redshift regression, where, as a proof of concept, we predict with high accuracy
the photometric redshifts of 5000 galaxies in the COSMOS survey, based on 25
photometric parameters derived from the survey. We show that our GA-SVM method is
more efficient than existing methods for star-galaxy classification and more
robust than existing methods for photometric redshift estimation.