Kumar, Sidharth
Gezari, Suvi
The application of advanced statistical methods to astrophysical problems is desirable for reasons of time efficiency and robustness. A data-driven approach, when combined with physical insight, can expedite solutions to difficult problems where data are plentiful but physical insight may be nebulous. This may be due either to the parametric complexity of the assumed models or to the inherent complexity in the behavior of the astrophysical system itself. In this thesis we demonstrate this by applying a variety of statistical tools to the Pan-STARRS1 Medium-Deep Survey data to solve two important classification problems faced by the survey.

The Pan-STARRS1 (PS1) Survey is unique in terms of its temporal, spatial, and wavelength coverage, permitting extensive studies of known astrophysical sources such as active galactic nuclei (AGN) and supernovae (SNe), as well as exotic ones such as tidal disruption events and recoiling supermassive black hole binaries. The Medium-Deep (MD) survey in particular offers a time resolution on the order of a few days over 10 distinct 8 sq. deg. fields, or over 80 sq. deg. of sky, and, with the technique of difference imaging, enables the detailed study of stochastic variations and explosive transients associated with extragalactic sources.

In the first of two parts of this thesis, I outline a novel method for the light-curve characterization of Pan-STARRS1 Medium-Deep Survey (PS1 MDS) extragalactic sources into stochastic variables (SV) and burst-like (BL) transients, using multi-band difference-imaging time-series data. We model the time series in the four PS1 photometric bands g_P1, r_P1, i_P1, and z_P1 using a combination of Bayesian leave-out-one cross-validation and corrected Akaike information criteria, and then use a k-means clustering decision algorithm to classify sources as burst-like or stochastically variable with over 91% purity, based on spectroscopically confirmed AGN and SN verification samples.
The performance of our classifier is comparable to the best among existing methods in terms of purity. We use our method to classify 4361 difference-image sources with galaxy hosts in the PS1 MD fields as BL or SV and then, together with their host-galaxy offsets, create a robust sample of AGN and SNe. From these variability-selected samples, we derive photometry-based and variability-based priors that can be used in future survey data streams for near-real-time classification.

In the second part, I discuss the applications of a genetic-algorithm-optimized support vector machine (GA-SVM), a machine learning classifier and regression tool that we developed to solve two important problems in astronomical surveys: (a) star-galaxy classification, where we show, as a proof of concept, the efficient separation of 11000 stars and galaxies in the MD fields using 32 photometric parameters derived from the PS1 MD stack [1]; and (b) photometric redshift regression, where, as a proof of concept, we predict with high accuracy the photometric redshifts of 5000 galaxies in the COSMOS survey, based on 25 photometric parameters derived from the survey. We show that our GA-SVM method is more efficient than existing methods for star-galaxy classification and more robust than existing methods for photometric redshift estimation.
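
As an illustration of the corrected Akaike information criterion mentioned in the first part, the sketch below compares two candidate descriptions of a single-band light curve. It is a minimal example under stated assumptions, not the thesis's implementation: the synthetic fluxes, the uncertainty, the two models ("constant" and "burst"), and all function names are hypothetical, and the model parameters are taken as already fitted. The only standard ingredient is the formula AICc = 2k - 2 ln L + 2k(k + 1)/(n - k - 1), where k is the number of free parameters and n the number of data points.

```python
import math


def aicc(log_likelihood, k, n):
    """Corrected Akaike information criterion for k parameters and n points."""
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def gaussian_log_likelihood(residuals, sigma):
    """Log-likelihood for independent Gaussian errors with a common sigma."""
    norm = -math.log(sigma) - 0.5 * math.log(2 * math.pi)
    return sum(-0.5 * (r / sigma) ** 2 + norm for r in residuals)


# Hypothetical difference-flux light curve: a Gaussian-shaped burst plus a
# small deterministic perturbation standing in for noise.
times = list(range(20))
burst_model = [10.0 * math.exp(-0.5 * ((t - 10) / 2.0) ** 2) for t in times]
flux = [b + 0.1 * (-1) ** t for b, t in zip(burst_model, times)]
sigma = 0.5  # assumed photometric uncertainty
n = len(flux)

# Model 1: constant flux (1 parameter, the mean).
mean_flux = sum(flux) / n
res_constant = [f - mean_flux for f in flux]

# Model 2: Gaussian burst (3 parameters: amplitude, peak time, width),
# assumed here to have been fitted already.
res_burst = [f - b for f, b in zip(flux, burst_model)]

scores = {
    "constant": aicc(gaussian_log_likelihood(res_constant, sigma), 1, n),
    "burst": aicc(gaussian_log_likelihood(res_burst, sigma), 3, n),
}

# The model with the lowest AICc is preferred.
best = min(scores, key=scores.get)
print(best, scores)
```

The small-sample correction term matters for sparsely sampled light curves, where n is not much larger than k and the plain AIC would favor over-parameterized models.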