Salma Salhi, an MSc student at IREx, recently completed her degree at the Université de Montréal. Here, she summarizes her research project.
The James Webb Space Telescope has already revolutionized the science we can do, especially in the field of exoplanets. Its infrared instruments, like the Near Infrared Imager and Slitless Spectrograph (NIRISS), have been especially useful for surveying habitable-world candidates. However, one major limitation of NIRISS is the presence of a type of noise called "1/f noise", whose strength increases at lower frequencies (hence the name). Even the best methods of correcting this noise leave error bars that are larger than they should be.
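To make the "1/f" label concrete, here is a toy sketch (purely illustrative, not the NIRISS pipeline; all sizes and parameters are made up) that builds a random noise trace whose power spectrum scales as 1/f, then verifies that its power concentrates at low frequencies:

```python
import numpy as np

rng = np.random.default_rng(42)

n = 4096
freqs = np.fft.rfftfreq(n, d=1.0)

# Build a random spectrum whose power scales as 1/f (amplitude ~ f^-0.5),
# then transform back to the time domain to get a 1/f ("pink") noise trace.
amplitude = np.zeros_like(freqs)
amplitude[1:] = freqs[1:] ** -0.5        # skip f = 0 to avoid dividing by zero
phases = rng.uniform(0.0, 2.0 * np.pi, size=freqs.size)
spectrum = amplitude * np.exp(1j * phases)
noise = np.fft.irfft(spectrum, n=n)

# Power is concentrated at low frequencies: compare band-averaged power.
power = np.abs(np.fft.rfft(noise)) ** 2
low = power[1:100].mean()
high = power[-100:].mean()
```

By construction, `low` comes out far larger than `high`: the lowest frequencies dominate, which is exactly why this noise is hard to average away.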
The goal of my Master’s project was to address this problem from a new perspective, using real data to understand the structure of this noise. Most conventional methods of correcting 1/f noise rely on “deterministic” approaches. In a NIRISS image of the spectrum of a star (the amount of light received at each wavelength), recorded as a planet passes in front of it, we calculate the median pixel value of each column and subtract that median from every pixel in the column. While this eliminates some of the 1/f noise, it leaves residuals on the image, because it assumes the noise is entirely predictable, with no element of randomness down the column, which is not the case. A probabilistic method of correction is a better approach. This noise does not have the “classical” shape we often expect (i.e. bell-shaped, or Gaussian), and it has some stochasticity, or randomness. This randomness can be described by what we call a probability distribution. If we understand how the noise is distributed, we can combine what we observe with what we already know about the noise, in a framework known as Bayesian inference, to estimate a distribution of the original, noise-free image. The shape of this distribution gives us robust uncertainty constraints on our estimate of the underlying image.
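The deterministic column-median correction described above can be sketched in a few lines. This toy example uses a hypothetical synthetic image (the array sizes and noise levels are illustrative, not real NIRISS values):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "detector image": per-pixel noise plus an offset that is constant
# down each column, mimicking the part of 1/f noise that a column-wise
# correction can remove.
pixel_noise = rng.normal(0.0, 0.1, size=(64, 32))
column_offset = rng.normal(0.0, 1.0, size=(1, 32))
image = pixel_noise + column_offset

# Deterministic correction: subtract each column's median from the column.
corrected = image - np.median(image, axis=0, keepdims=True)

# The shared offset is gone, but any noise that varies down the column
# (here, pixel_noise) survives as a residual.
```

The subtraction removes only the column-constant part of the noise; anything that fluctuates down the column is left behind, which is why residuals remain in practice.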
Since 1/f noise is non-Gaussian, we cannot describe it with a simple formula. Instead, we need an algorithm or model that can determine its distribution from example data. A machine learning approach is well suited to this goal. A neural network, a type of computer model that learns patterns from data, can be trained on real images of the noise to learn its distribution directly. To this end, I trained a type of neural network called a score-based diffusion model on dark images, which contain only 1/f and other instrumental noise, without any astrophysical signal. I also trained a separate model on simulated images of the noiseless signal. Combining these two models in a Bayesian framework allowed me to generate the final distribution of possible realizations of clean, noiseless images.
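The Bayesian combination can be illustrated with a deliberately simple one-dimensional toy. The key fact is that the score (the gradient of the log-probability) of the posterior is the sum of the prior score and the likelihood score. In the actual method the prior score comes from a trained diffusion model; in this sketch both terms are analytic Gaussians chosen only for illustration, so the result can be checked against the known answer (all values below are made up):

```python
# Toy 1-D illustration: posterior score = prior score + likelihood score.
mu_prior, s_prior = 0.0, 2.0      # toy prior over the "noiseless" value
y_obs, s_noise = 3.0, 1.0         # toy observed value and noise scale

def prior_score(x):
    # d/dx log N(x; mu_prior, s_prior^2) -- stands in for the diffusion model
    return -(x - mu_prior) / s_prior**2

def likelihood_score(x):
    # d/dx log N(y_obs; x, s_noise^2)
    return (y_obs - x) / s_noise**2

def posterior_score(x):
    return prior_score(x) + likelihood_score(x)

# Gradient ascent on the log-posterior converges to its mode, which for
# Gaussians coincides with the analytic posterior mean.
x = 0.0
for _ in range(2000):
    x += 0.01 * posterior_score(x)

analytic_mean = (mu_prior / s_prior**2 + y_obs / s_noise**2) / (
    1.0 / s_prior**2 + 1.0 / s_noise**2
)
print(round(x, 3), round(analytic_mean, 3))  # prints: 2.4 2.4
```

In the real problem the same sum of scores drives the sampler, but the prior score is the neural network's output and the state is a full image rather than a single number, so one obtains a whole distribution of clean images instead of a single point estimate.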
This method achieves the highest possible precision on simulated data, known as "photon-limited" precision, and eliminates the problem of inflated error bars. During my PhD, I plan to apply it to real observations. I anticipate it will be particularly useful for studying rocky planets, where high precision on atmospheric detections is especially important.
Salma completed her MSc degree between 2023 and 2025, under the supervision of IREx professor René Doyon and Laurence Perreault-Levasseur. Her thesis will soon be available on Papyrus.