A simple Introduction to Gaussian Processes

A full description of GPs is well beyond the scope of a short presentation. For a comprehensive introduction to the theory, I recommend Rasmussen & Williams (2006), which is freely available for download. Instead, I will give a qualitative motivation for using GPs: when your data contain correlated noise, you would like to model it so that you neither bias your parameter estimates nor underestimate the error bars. Other approaches to this problem include Carter & Winn (2009)'s wavelet transform method. Both of these methods have their advantages. For an implementation of the latter please see Szilard, who uses it to great effect.

1. The data

I will simulate some data that include white noise (the photon/Poisson noise you would have in a real light curve), as well as residual systematics that are correlated between data points. The exact form of this correlated noise is, by definition, not known a priori, but ignoring it will introduce significant biases into your parameter inferences.

As a base model, I choose a single Gaussian feature with 3 parameters: amplitude $\alpha$, centre $\ell$ and variance $\sigma^2$. You can think of this as an absorption line or an exoplanet transit :-)

In [1]:
#import the necessary python modules
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
from matplotlib import rcParams
plt.rc('text', usetex=True)
matplotlib.rc('font', family='serif')

import george
from george import kernels
In [2]:
#define the base model, a single gaussian
def model(params, t):
    amp, loc, sig2 = params
    return amp * np.exp(-0.5 * (t - loc) ** 2 / sig2)
In [3]:
#define a function that will generate our synthetic data
def generate_data(params, N, rng=(-5, 5)):
    gp = george.GP(0.1 * kernels.ExpSquaredKernel(3.3)) #create red noise from george.GP class (later explained)
    t = rng[0] + np.diff(rng) * np.sort(np.random.rand(N)) #create unevenly sampled x values
    y = gp.sample(t)                                    #create correlated noise values by sampling the gp object
    y += model(params, t)                               #add values from the model
    yerr = 0.05 + 0.05 * np.random.rand(N)              #create per-point white-noise levels (error bars) with arbitrary values
    y += yerr * np.random.randn(N)                      #add Gaussian white noise on top of the correlated noise
    return t, y, yerr
In [4]:
#now lets generate the data
np.random.seed(1234)                  #seed the generator in order to generate the same data every time the code runs
truth = [-1.0, 0.1, 0.4]              #model parameters used to simulate data (=> true)
t, y, yerr = generate_data(truth, 50) #generate the data     
rcParams['figure.figsize'] = 10, 7
plt.figure(1)
plt.errorbar(t, y, yerr=yerr, fmt=".k", capsize=0)
plt.ylabel(r"$y$",fontsize=20)
plt.xlabel(r"$t$",fontsize=20)
plt.xlim(-5, 5)
plt.title("synthetic data")
#plt.savefig('synthetic_data.pdf', dpi=150)   #uncomment if you would like to save the plot
plt.show()

You will notice that I have drawn the red noise from a Gaussian Process. Therefore, in what follows, I will use a different kernel to mitigate this fact. Please bear in mind that the model parameters used, and therefore our target values, are:

$$ \alpha = -1,~~~~~~ \ell = 0.1,~~~~~~ \sigma^2=0.4 $$

1.1 Bayesian framework

Before starting the analysis I would like to refresh my memory of the Byesian framework in which we will be working. That is:

$$ P(\mathrm{pars} \mid \mathrm{data}(y,t), \mathrm{noise}) \propto P(\mathrm{pars}) \times P(y \mid t, \mathrm{noise}, \mathrm{pars}) ~~~~~~~~~~~(1)$$

Namely, the posterior probability of the parameters we are after is dictated by our prior knowledge of the parameters and by the likelihood of the data (posterior probability = prior probability $\times$ likelihood / marginal likelihood).

This is also written as log posterior = log prior + log likelihood - log(marginal likelihood).
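As a schematic sketch of how equation (1) is used in practice (the problem-specific prior, likelihood and posterior functions are defined in the cells below):

def ln_posterior(params, data, ln_prior, ln_likelihood):
    # log posterior = log prior + log likelihood; the constant log marginal likelihood is dropped
    lp = ln_prior(params)
    if not np.isfinite(lp):    # zero prior probability, no need to evaluate the likelihood
        return -np.inf
    return lp + ln_likelihood(params, *data)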

2. The White noise case

Now let's consider the standard case, in which we assume only white (uncorrelated) noise in our data. It is standard practice to work with the log-likelihood, which avoids numerical underflow of very small probabilities and turns products into sums. The log-likelihood of the data, given the parameters $\theta$, is:

$$ \ln P\left({y_n}|{t_n},{\sigma_n^2},\theta\right) = -\frac{1}{2} \sum_{n=1}^N \frac{[y_n-f_{\theta}(t_n)]^2}{\sigma_n^2} + \text{const} ~~~~~~~~~~~(2)$$

Looking at the data, there appears to be a systematic trend, so we include it in the descriptive model to be fitted to the data. Such trends are quite common in transit light curves, e.g. due to airmass variations. Our final analytical model is therefore:

$$f_{\theta}(t) = mt + b + \alpha \exp\left(-\frac{[t-\ell]^2}{2\sigma^2}\right) ~~~~~~~~~~~(3)$$

where $\theta$ is the 5-dimensional parameter vector $\{ m,b,\alpha,\ell,\sigma^2\}$.

In [5]:
#Defining the likelihood function which also includes a linear trend
def lnlike_ind(p, t, y, invar):
    m = model(p[2:], t) + p[0] * t + p[1]
    return -0.5 * np.sum((y - m) ** 2 * invar)       #invar = 1/sigma**2

Now, to fit this model using MCMC, we also have to define our priors. Depending on your situation, you might want to define these based on previous knowledge of your parameters. Here, I will just assume simple box (uniform) priors.

In [6]:
#Priors of the base model
def lnprior_base(p):
    amp, loc, sig2 = p
    if not -10 < amp < 10:
        return -np.inf
    if not -5 < loc < 5:
        return -np.inf
    if not 0 < sig2 < 3.0:
        return -np.inf
    return 0.0
#add the priors of the linear trend to the prior function of the base model
def lnprior_ind(p):
    m, b = p[:2]
    if not -10 < m < 10:
        return -np.inf
    if not -10 < b < 10:
        return -np.inf
    return lnprior_base(p[2:])

We also need to define the Posterior function:

In [7]:
#Defining the posterior probability distribution
def lnprob_ind(p, t, y, invar):
    lp = lnprior_ind(p)
    if not np.isfinite(lp):
        return -np.inf
    return lp + lnlike_ind(p, t, y, invar)#note that I do not include the log(marg. likelihood)

Now that the model is implemented, we run the MCMC: we initialize the walkers and run both a burn-in and a production chain. In reality, you should run multiple, independent chains and check their consistency and mutual convergence (a quick sanity check on this chain is sketched after the production run).

In [8]:
import emcee
initial = np.array([0, 0, -1.0, 0.1, 0.4])   #initial guess of the inferred parameters.
ndim = len(initial)
nwalkers = 32                                #number of walkers.
data = (t, y, 1/yerr**2)                     #need to put the data in a tuple.
p0 = [np.array(initial) + 1e-8 * np.random.randn(ndim)
      for i in range(nwalkers)]              #initialize walkers.
sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob_ind, args=data)
In [9]:
#Settling-in phase
print("Running burn-in...")
p0, _, _ = sampler.run_mcmc(p0, 500)   
Running burn-in...
In [10]:
#reset the sampler and run the production phase
sampler.reset()
print("Running production...")
sampler.run_mcmc(p0, 1000)
Running production...
Out[10]:
(final walker positions, final log-probabilities, random-number-generator state)
In [11]:
samples = sampler.flatchain
In [12]:
np.shape(samples)
Out[12]:
(32000, 5)
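Following the earlier note on convergence, here is a quick sanity check on the production chain (a rough sketch only: emcee's walkers are not fully independent chains, so this is not a substitute for running separate chains). It prints the mean acceptance fraction and a Gelman-Rubin-style statistic computed across the 32 walkers, which should be close to 1 for each parameter.

print("mean acceptance fraction: %.2f" % np.mean(sampler.acceptance_fraction))

def gelman_rubin(chain):
    # chain has shape (nwalkers, nsteps, ndim); compare within- and between-walker variances
    nwalkers, nsteps, ndim = chain.shape
    walker_means = np.mean(chain, axis=1)
    walker_vars = np.var(chain, axis=1, ddof=1)
    W = np.mean(walker_vars, axis=0)                    # mean within-walker variance
    B = nsteps * np.var(walker_means, axis=0, ddof=1)   # between-walker variance
    var_est = (1.0 - 1.0 / nsteps) * W + B / nsteps
    return np.sqrt(var_est / W)

print(gelman_rubin(sampler.chain))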
In [13]:
# Plot the samples in data space. We draw 24 posterior samples randomly.
print("Making plots")
samples = sampler.flatchain          #flatten the chain (stitch all 32 walkers of 1000 long together)
rcParams['figure.figsize'] = 10, 7
plt.figure(2)
plt.ylabel(r"$y$",fontsize=15)
plt.xlabel(r"$t$",fontsize=15)
plt.xlim(-5, 5)
plt.errorbar(t, y, yerr=yerr, fmt=".k", capsize=0)
x = np.linspace(-5, 5, 500)
for s in samples[np.random.randint(len(samples), size=24)]:
    plt.plot(x, model(s[2:], x)+s[0]*x+s[1], color="#4682b4", alpha=0.3)
plt.title(r'Results~assuming~uncorrelated~noise')
#plt.savefig("final_fit_WN.pdf", dpi=150)
Making plots
In [14]:
#now we can look at the parameter evolutions in the production chain
from matplotlib.ticker import MaxNLocator

plt.rcParams['xtick.labelsize'] = 10
plt.rcParams['ytick.labelsize'] = 10
plt.clf()
fig, axes = plt.subplots(3, 1, sharex=True, figsize=(8, 6))
for i in range(3):
    axes[i].yaxis.label.set_size(20)
    axes[i].yaxis.set_major_locator(MaxNLocator(3))
    axes[i].plot(sampler.chain[:, :, i+2].T, color="k", alpha=0.4)
    axes[i].axhline(truth[i], color="#888888", lw=2)
axes[0].set_ylabel(r"$\alpha$")
axes[1].set_ylabel(r"$l$")
axes[2].set_ylabel(r"$\sigma^2$")
axes[2].set_xlabel("step number",fontsize=20)
fig.tight_layout(h_pad=0.0)
plt.show()
In [15]:
#And now the final results.  From the analysis of the chain, we obtain our posterior distributions.
import corner
plt.figure()
plt.rcParams['axes.titlesize'] = 17
plt.rcParams['axes.labelsize'] = 17
labels = [r"$\alpha$", r"$\ell$", r"$\sigma^2$"]
fig = corner.corner(samples[:, 2:], bins=50,truths=truth,labels=labels,show_titles=True,
                    quantiles=(0.16,0.5,0.84))
#fig.savefig("posteriors.pdf", dpi=150)

2.1 Conclusions

Looking at the final fit, at first it seems quite satisfactory. However, a closer look at the final posterior probability distributions shows that our result for the $\ell$ parameter is simply wrong given the current error estimates: the true value lies outside the quoted uncertainties.

3. The systematic noise model case

Now instead of assuming that the noise is white, we generalise our likelihood function to include covariances between the data points (red noise).

3.1 Gaussian Processes

(This section is adapted from Neale Gibson's papers.) A Gaussian Process (GP) is a non-parametric method used extensively for regression and classification problems in the machine learning community. A GP is formally defined as a collection of random variables $y$, any finite number of which have a joint Gaussian distribution, e.g.:

$$\boldsymbol{y} \sim N(\mu,\Sigma)$$

Generally speaking, when we fit a model or a mean function with parameters $\theta$ to data $t$ and $y$, we construct a likelihood function under the assumption of independent Gaussian uncertainties, $\sigma_n$:

$$ \mathrm{likelihood} = \prod_{n=1}^N \frac{1}{\sqrt{2\pi\sigma_n^2}} ~~ \exp \left(-\frac{[y_n-f_{\theta}(t_n)]^2}{2\sigma_n^2} \right) ~~~~~~~~ (4.1) $$

This likelihood is central to inferring the probability distributions of our parameters $\theta$. We can optimise it with respect to $\theta$, or construct a posterior distribution for $\theta$ via Bayes' theorem after defining priors. Note that the term inside the exponential is $-\frac{1}{2}\chi^2$, so optimising the likelihood is equivalent to minimising $\chi^2$ if the uncertainties are held fixed. This is already a Gaussian process, albeit a trivial one.

More generally, we can use a multi-variate Gaussian distribution as our likelihood, where we consider covariance between the data points:

$$ \mathrm{likelihood} = \frac{1}{|2\pi\Sigma|^{1/2}} ~~ \exp \left( -\frac{1}{2} \boldsymbol{r}^T \Sigma^{-1}\boldsymbol{r} \right) ~~~~~~~~ (4.2)$$

with $\Sigma$ being the covariance matrix and $\boldsymbol{r} = \{y_n-f_\theta (t_n)\}$.

3.2 Modelling our noise

Based on the description above, we can rewrite our likelihood function (2) as a matrix equation (essentially just rewriting equation (4.2)):

$$ \ln P\left({y_n}|{t_n},{\sigma_n^2},\theta\right) = -\frac{1}{2} \boldsymbol{r}^\mathrm{T}\Sigma^{-1}\boldsymbol{r} -\frac{1}{2} \ln |\Sigma| - \frac{N}{2} \ln 2\pi ~~~~~~~~ (5)$$

If we assume purely white noise, as we did in the previous section, then what we are saying is that the covariance matrix, $\Sigma$, is diagonal. Subsequently, we include the presence of red noise by populating the off-diagonal elements (essentially introducing co-variance).

Ideally, we would like to make every off-diagonal element a free parameter, but that would be too many free parameters. So instead we "model" these elements using a covariance function, or a kernel. In this framework, the covariance matrix is given by:

$$ \Sigma_{ij} = \sigma_i^2\delta_{ij} + k(t_i,t_j) ~~~~~~~~ (6)$$

where $\delta$ is the Kronecker delta and $k$ is the kernel.
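To make equations (5) and (6) concrete, here is a minimal numpy sketch that builds the covariance matrix from a kernel plus the white-noise terms and evaluates the log-likelihood via a Cholesky factorization. The squared-exponential kernel and its parameter values are purely illustrative; george performs the equivalent computation internally.

import numpy as np

def sq_exp_kernel(t1, t2, amp2=0.1, length2=3.3):
    # illustrative squared-exponential kernel: k(t_i, t_j) = amp2 * exp(-(t_i - t_j)^2 / (2*length2))
    dt = t1[:, None] - t2[None, :]
    return amp2 * np.exp(-0.5 * dt ** 2 / length2)

def gp_lnlike(r, t, yerr, kernel):
    # equation (6): white-noise variances on the diagonal plus the kernel for the off-diagonal structure
    Sigma = np.diag(yerr ** 2) + kernel(t, t)
    # equation (5), evaluated via a Cholesky factorization for numerical stability
    L = np.linalg.cholesky(Sigma)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, r))
    lndet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (np.dot(r, alpha) + lndet + len(r) * np.log(2 * np.pi))

# e.g. gp_lnlike(y - model(truth, t), t, yerr, sq_exp_kernel) for the residuals of the true model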

3.2.1 Kernels

There are many different kernels you can use. The choice is usually dictated by the form of the systematics present in the data. For instance, if the systematics are strongly correlated with some physical variable (e.g. the orbital phase of HST), often referred to as optical state parameters, the Matérn family of kernels is preferred. In that case those physical variables become the inputs of the kernel, i.e. the elements of the covariance matrix are modelled from physical measurements of the cause of the systematics. Examples of covariance kernels are:

  • Matern family (1/2,3/2,5/2,...)
  • Exponential
  • Squared Exponential (SE) (generally best suited to ground-based transit light curves, since it is infinitely smooth)
  • Rational Quadratic (RQ)
  • Periodic family
  • Linear
  • ...

You should always think about your data and which kernel suits the modelling best. If in doubt, try different ones and compare their goodness of fit through statistical tests (BIC, likelihood-ratio test, ...); a minimal BIC sketch is given below. Also consider defining your own kernel if none of the standard forms suits your purpose.
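As a minimal illustration of such a comparison (a sketch, not part of the original analysis), the BIC can be computed from the maximised log-likelihood of each candidate noise model, penalising the number of free parameters:

def bic(max_lnlike, n_params, n_data):
    # Bayesian Information Criterion: k*ln(N) - 2*ln(L_max); lower values are preferred
    # when comparing models fitted to the same data
    return n_params * np.log(n_data) - 2.0 * max_lnlike

# hypothetical usage, e.g. comparing a Matern-3/2 fit against an SE fit on the same data:
# print(bic(lnlike_matern, 7, len(t)), bic(lnlike_se, 7, len(t)))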

3.2.2 A better modelling of our data

To model the systematics in our previously defined data, I will choose the Matérn-3/2 kernel. It would of course be ideal to use the SE kernel, from which the systematics in the simulated data were drawn, but to make the exercise a bit more realistic I use a different kernel. This kernel is given by:

$$ k(t_i,t_j) = \zeta^2 \left(1+\frac{\sqrt{3}~|t_i-t_j|}{\tau} \right) ~ \exp \left(-\frac{\sqrt{3}~|t_i-t_j|}{\tau} \right) ~~~~~~~(7)$$

where $\zeta$ and $\tau$ are the parameters of the kernel, describing the amplitude and smoothness of the function. We are now ready to redefine the log-likelihood function to include the GP. The python module george, already used to simulate the data, can be used to create our GP object. There are other freely available modules that let you define a GP, such as sklearn from scikit-learn, or Neale Gibson's GeePea (which I personally use).
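For intuition, equation (7) can be written out directly in numpy. This is just an illustrative sketch: in the fit below the kernel is built with george's Matern32Kernel, whose internal parametrization of the length scale may differ from the $(\zeta, \tau)$ form above.

import numpy as np

def matern32_kernel(t1, t2, zeta, tau):
    # equation (7): zeta sets the amplitude, tau the characteristic (smoothness) time scale
    x = np.sqrt(3.0) * np.abs(t1[:, None] - t2[None, :]) / tau
    return zeta ** 2 * (1.0 + x) * np.exp(-x)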


In [16]:
import george
from george import kernels

#likelihood function with a GP
def lnlike_gp(p, t, y, yerr):
    a, tau = np.exp(p[:2])
    gp = george.GP(a * kernels.Matern32Kernel(tau)) #Create the base Gaussian Process object.
    gp.compute(t, yerr)    #compute the covariance matrix and factorize it for a set of times
                           #and uncertainties.
    return gp.lnlikelihood(y - model(p[2:], t))

#addings priors for the kernel parameters to the general base prior
def lnprior_gp(p):
    lna, lntau = p[:2]
    if not -5 < lna < 5:
        return -np.inf
    if not -5 < lntau < 5:
        return -np.inf
    return lnprior_base(p[2:])
"""
Calculate the posterior function.  You should notice that the linear trend function is no longer
added to the model.  This trend will now be accounted for by our GP model. However, if there
is a physical measurable that is the cause of this trend (airmass for instance), then the
trend can still be included.
"""
def lnprob_gp(p, t, y, yerr):
    lp = lnprior_gp(p)
    if not np.isfinite(lp):
        return -np.inf
    return lp + lnlike_gp(p, t, y, yerr)
In [17]:
#make the entire fitting process a function, for general-purpose use.
def fit_gp(initial, data, nwalkers=32):
    ndim = len(initial)
    p0 = [np.array(initial) + 1e-8 * np.random.randn(ndim)
          for i in range(nwalkers)]
    sampler = emcee.EnsembleSampler(nwalkers, ndim, lnprob_gp, args=data)

    print("Running burn-in")
    p0, lnp, _ = sampler.run_mcmc(p0, 500)
    sampler.reset()

    print("Running second burn-in")
    p = p0[np.argmax(lnp)]
    p0 = [p + 1e-8 * np.random.randn(ndim) for i in range(nwalkers)]
    p0, _, _ = sampler.run_mcmc(p0, 500)
    sampler.reset()

    print("Running production")
    p0, _, _ = sampler.run_mcmc(p0, 1000)

    return sampler
In [18]:
print("Fitting GP")
data = (t, y, yerr)
truth_gp = [0.0, 0.0] + truth   #initial guesses for the parameters
sampler = fit_gp(truth_gp, data)
Fitting GP
Running burn-in
Running second burn-in
Running production
In [19]:
print("Making plots")
samples = sampler.flatchain
x = np.linspace(-5, 5, 500)
plt.figure()
plt.errorbar(t, y, yerr=yerr, fmt=".k", capsize=0)
#draw 100 random instances of the GP model from the posterior samples
for s in samples[np.random.randint(len(samples), size=100)]:
    gp = george.GP(np.exp(s[0]) * kernels.Matern32Kernel(np.exp(s[1])))
    gp.compute(t, yerr)
    m = gp.sample_conditional(y - model(s[2:], t), x) + model(s[2:], x)
    plt.plot(x, m, color="#4682b4", alpha=0.2)
plt.ylabel(r"$y$",fontsize=20)
plt.xlabel(r"$t$",fontsize=20)
plt.xlim(-5, 5)
plt.title("results with Gaussian process noise model",fontsize=15)
#plt.savefig("GP-model-fit.pdf", dpi=150)

# Make the corner plot.
plt.figure()
plt.rcParams['axes.titlesize'] = 17
plt.rcParams['axes.labelsize'] = 17
fig = corner.corner(samples[:, 2:], bins=50, truths=truth, labels=labels,show_titles=True,
                    quantiles=(0.16,0.5,0.84))
#fig.savefig("GP-posteriors.pdf", dpi=150)
Making plots

And voilà! The model that includes correlated noise is clearly less precise (it does not underestimate the error bars) and it is more accurate. The moral of the story: modelling the correlated noise gives less biased parameter estimates.

If you are by any chance inspired to start using Python, I highly recommend Spyder as your editor. It lets you run multiple environments (python or IPython), and offers cell creation, line-by-line execution, debugging features and much more. At the beginning I went through a few different options, and this is far and away my favourite.
