Skip to content

Godambe

Parameter uncertainties and likelihood ratio tests using Godambe information.

FIM_uncert(func_ex, grid_pts, p0, data, log=False, multinom=True, eps=0.01, return_FIM=False)

Parameter uncertainties from Fisher Information Matrix

Returns standard deviations of parameter values.

Parameters:

Name Type Description Default
func_ex func

Model function

required
grid_pts list[int]

List of grid points to evaluate func_ex

required
p0 list[float]

Best-fit parameters for func_ex

required
data Spectrum

Original data frequency spectrum

required
log bool

If True, assume log-normal distribution of parameters. Returned values are then the standard deviations of the logs of the parameter values, which can be interpreted as relative parameter uncertainties.

False
multinom bool

If True, assume model is defined without an explicit parameter for theta. Because uncertainty in theta must be accounted for to get correct uncertainties for other parameters, this function will automatically consider theta if multinom=True. In that case, the final entry of the returned uncertainties will correspond to theta.

True
eps float

Fractional stepsize to use when taking finite-difference derivatives. Note that if eps*param is < 1e-6, then the step size for that parameter will simply be eps, to avoid numerical issues with small parameter perturbations.

0.01
return_FIM bool

If true, also return the full FIM.

False
Source code in dadi/Godambe.py
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
def FIM_uncert(func_ex, grid_pts, p0, data, log=False, multinom=True, eps=0.01, return_FIM=False):
    """
    Parameter uncertainties from Fisher Information Matrix

    Returns standard deviations of parameter values.

    Args:
        func_ex (func): Model function
        grid_pts (list[int]): List of grid points to evaluate func_ex
        p0 (list[float]): Best-fit parameters for func_ex
        data (Spectrum): Original data frequency spectrum
        log (bool): If True, assume log-normal distribution of parameters. Returned values
            are then the standard deviations of the *logs* of the parameter values,
            which can be interpreted as relative parameter uncertainties.
        multinom (bool): If True, assume model is defined without an explicit parameter for
                theta. Because uncertainty in theta must be accounted for to get
                correct uncertainties for other parameters, this function will
                automatically consider theta if multinom=True. In that case, the
                final entry of the returned uncertainties will correspond to
                theta.
        eps (float): Fractional stepsize to use when taking finite-difference derivatives.
            Note that if eps*param is < 1e-6, then the step size for that parameter
            will simply be eps, to avoid numerical issues with small parameter
            perturbations.
        return_FIM (bool): If true, also return the full FIM.
    """
    if multinom:
        func_multi = func_ex
        model = func_multi(p0, data.sample_sizes, grid_pts)
        theta_opt = Inference.optimal_sfs_scaling(model, data)
        p0 = list(p0) + [theta_opt]
        func_ex = lambda p, ns, pts: p[-1]*func_multi(p[:-1], ns, pts)
    H = get_godambe(func_ex, grid_pts, [], p0, data, eps, log, just_hess=True)
    uncerts = numpy.sqrt(numpy.diag(numpy.linalg.inv(H)))
    if not return_FIM:
        return uncerts
    else:
        return uncerts, H

GIM_uncert(func_ex, grid_pts, all_boot, p0, data, log=False, multinom=True, eps=0.01, return_GIM=False, boot_theta_adjusts=None)

Parameter uncertainties from Godambe Information Matrix (GIM)

Returns standard deviations of parameter values.

Parameters:

Name Type Description Default
func_ex func

Model function

required
grid_pts list[int]

Grid points at which to evaluate func_ex

required
all_boot list[Spectrum]

List of bootstrap frequency spectra

required
p0 list[flaot]

Best-fit parameters for func_ex

required
data Spectrum

Original data frequency spectrum

required
log bool

If True, assume log-normal distribution of parameters. Returned values are then the standard deviations of the logs of the parameter values, which can be interpreted as relative parameter uncertainties.

False
multinom bool

If True, assume model is defined without an explicit parameter for theta. Because uncertainty in theta must be accounted for to get correct uncertainties for other parameters, this function will automatically consider theta if multinom=True. In that case, the final entry of the returned uncertainties will correspond to theta.

True
eps float

Fractional stepsize to use when taking finite-difference derivatives. Note that if eps*param is < 1e-6, then the step size for that parameter will simply be eps, to avoid numerical issues with small parameter perturbations.

0.01
return_GIM bool

If true, also return the full GIM.

False
boot_theta_adjusts list[float]

Optionally, a sequence of relative values of theta (compared to original data) to assume for bootstrap data sets. Only valid when multinom=False.

None
Source code in dadi/Godambe.py
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
def GIM_uncert(func_ex, grid_pts, all_boot, p0, data, log=False,
               multinom=True, eps=0.01, return_GIM=False,
               boot_theta_adjusts=None):
    """
    Parameter uncertainties from Godambe Information Matrix (GIM)

    Returns standard deviations of parameter values.

    Args:
        func_ex (func): Model function
        grid_pts (list[int]): Grid points at which to evaluate func_ex
        all_boot (list[Spectrum]): List of bootstrap frequency spectra
        p0 (list[flaot]): Best-fit parameters for func_ex
        data (Spectrum): Original data frequency spectrum
        log (bool): If True, assume log-normal distribution of parameters. Returned values
            are then the standard deviations of the *logs* of the parameter values,
            which can be interpreted as relative parameter uncertainties.
        multinom (bool): If True, assume model is defined without an explicit parameter for
                theta. Because uncertainty in theta must be accounted for to get
                correct uncertainties for other parameters, this function will
                automatically consider theta if multinom=True. In that case, the
                final entry of the returned uncertainties will correspond to
                theta.
        eps (float): Fractional stepsize to use when taking finite-difference derivatives.
            Note that if eps*param is < 1e-6, then the step size for that parameter
            will simply be eps, to avoid numerical issues with small parameter
            perturbations.
        return_GIM (bool): If true, also return the full GIM.
        boot_theta_adjusts (list[float]): Optionally, a sequence of *relative* values of theta
                            (compared to original data) to assume for bootstrap
                            data sets. Only valid when multinom=False.
    """
    if multinom:
        if boot_theta_adjusts:
            raise ValueError('boot_thetas option can only be used with '
                             'multinom=False')
        func_multi = func_ex
        model = func_multi(p0, data.sample_sizes, grid_pts)
        theta_opt = Inference.optimal_sfs_scaling(model, data)
        p0 = list(p0) + [theta_opt]
        func_ex = lambda p, ns, pts: p[-1]*func_multi(p[:-1], ns, pts)
    GIM, H, J, cU = get_godambe(func_ex, grid_pts, all_boot, p0, data, eps, log,
                                boot_theta_adjusts=boot_theta_adjusts)
    uncerts = numpy.sqrt(numpy.diag(numpy.linalg.inv(GIM)))
    if not return_GIM:
        return uncerts
    else:
        return uncerts, GIM, H

LRT_adjust(func_ex, grid_pts, all_boot, p0, data, nested_indices, multinom=True, eps=0.01, boot_theta_adjusts=None)

First-order moment matching adjustment factor for likelihood ratio test

Parameters:

Name Type Description Default
func_ex func

Model function for complex model

required
grid_pts list[float]

Grid points at which to evaluate func_ex

required
all_boot list[Spectrum]

List of bootstrap frequency spectra

required
p0 list[float]

Best-fit parameters for the simple model, with nested parameter explicity defined. Although equal to values for simple model, should be in a list form that can be taken in by the complex model you'd like to evaluate.

required
data Spectrum

Original data frequency spectrum

required
nested_indices list[float]

List of positions of nested parameters in complex model parameter list

required
multinom bool

If True, assume model is defined without an explicit parameter for theta. Because uncertainty in theta must be accounted for to get correct uncertainties for other parameters, this function will automatically consider theta if multinom=True.

True
eps float

Fractional stepsize to use when taking finite-difference derivatives Note that if eps*param is < 1e-6, then the step size for that parameter will simply be eps, to avoid numerical issues with small parameter perturbations.

0.01
boot_theta_adjusts list[float]

Optionally, a sequence of relative values of theta (compared to original data) to assume for bootstrap data sets. Only valid when multinom=False.

None
Source code in dadi/Godambe.py
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
def LRT_adjust(func_ex, grid_pts, all_boot, p0, data, nested_indices,
               multinom=True, eps=0.01, boot_theta_adjusts=None):
    """
    First-order moment matching adjustment factor for likelihood ratio test

    Args:
        func_ex (func): Model function for complex model
        grid_pts (list[float]): Grid points at which to evaluate func_ex
        all_boot (list[Spectrum]): List of bootstrap frequency spectra
        p0 (list[float]): Best-fit parameters for the simple model, with nested parameter
            explicity defined.  Although equal to values for simple model, should
            be in a list form that can be taken in by the complex model you'd like
            to evaluate.
        data (Spectrum): Original data frequency spectrum
        nested_indices (list[float]): List of positions of nested parameters in complex model
                        parameter list
        multinom (bool): If True, assume model is defined without an explicit parameter for
                theta. Because uncertainty in theta must be accounted for to get
                correct uncertainties for other parameters, this function will
                automatically consider theta if multinom=True.
        eps (float): Fractional stepsize to use when taking finite-difference derivatives
            Note that if eps*param is < 1e-6, then the step size for that parameter
            will simply be eps, to avoid numerical issues with small parameter
            perturbations.
        boot_theta_adjusts (list[float]): Optionally, a sequence of *relative* values of theta
                            (compared to original data) to assume for bootstrap
                            data sets. Only valid when multinom=False.
    """
    if multinom and boot_theta_adjusts:
        raise ValueError('Cannot use boot_theta_adjusts with multinom=True.')
    if multinom:
        func_multi = func_ex
        model = func_multi(p0, data.sample_sizes, grid_pts)
        theta_opt = Inference.optimal_sfs_scaling(model, data)
        p0 = list(p0) + [theta_opt]
        func_ex = lambda p, ns, pts: p[-1]*func_multi(p[:-1], ns, pts)

    # We only need to take derivatives with respect to the parameters in the
    # complex model that have been set to specified values in the simple model
    def diff_func(diff_params, ns, grid_pts):
        # diff_params argument is only the nested parameters. All the rest
        # should come from p0
        full_params = numpy.array(p0, copy=True, dtype=float)
        # Use numpy indexing to set relevant parameters
        full_params[nested_indices] = diff_params
        return func_ex(full_params, ns, grid_pts)

    p_nested = numpy.asarray(p0)[nested_indices]
    GIM, H, J, cU = get_godambe(diff_func, grid_pts, all_boot, p_nested, data, 
                                eps, log=False, boot_theta_adjusts=boot_theta_adjusts)

    adjust = len(nested_indices)/numpy.trace(numpy.dot(J, numpy.linalg.inv(H)))
    return adjust

Wald_stat(func_ex, grid_pts, all_boot, p0, data, nested_indices, full_params, multinom=True, eps=0.01, adj_and_org=False)

Calculate test stastic from wald test

Parameters:

Name Type Description Default
func_ex func

Model function for complex model

required
grid_pts list[int]

Grid points to evaluate model function

required
all_boot list[Spectrum]

List of bootstrap frequency spectra

required
p0 list[float]

Best-fit parameters for the simple model, with nested parameter explicity defined. Although equal to values for simple model, should be in a list form that can be taken in by the complex model you'd like to evaluate.

required
data list[Spectrum]

Original data frequency spectrum

required
nested_indices list[int]

List of positions of nested parameters in complex model parameter list.

required
full_params list[float]

Parameter values for parameters found only in complex model, Can either be array with just values found only in the compelx model, or entire list of parameters from complex model.

required
multinom bool

If True, assume model is defined without an explicit parameter for theta. Because uncertainty in theta must be accounted for to get correct uncertainties for other parameters, this function will automatically consider theta if multinom=True. In that case, the final entry of the returned uncertainties will correspond to theta.

True
eps float

Fractional stepsize to use when taking finite-difference derivatives Note that if eps*param is < 1e-6, then the step size for that parameter will simply be eps, to avoid numerical issues with small parameter perturbations.

0.01
adj_and_org bool

If False, return only adjusted Wald statistic. If True, also return unadjusted statistic as second return value.

False
Source code in dadi/Godambe.py
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
def Wald_stat(func_ex, grid_pts, all_boot, p0, data, nested_indices,
              full_params, multinom=True, eps=0.01, adj_and_org=False):
    # XXX: Implement boot_theta_adjusts
    """
    Calculate test stastic from wald test

    Args:
        func_ex (func): Model function for complex model
        grid_pts (list[int]): Grid points to evaluate model function
        all_boot (list[Spectrum]): List of bootstrap frequency spectra
        p0 (list[float]): Best-fit parameters for the simple model, with nested parameter
            explicity defined.  Although equal to values for simple model, should
            be in a list form that can be taken in by the complex model you'd like
            to evaluate.
        data (list[Spectrum]): Original data frequency spectrum
        nested_indices (list[int]): List of positions of nested parameters in complex model
                        parameter list.
        full_params (list[float]): Parameter values for parameters found only in complex model,
                    Can either be array with just values found only in the compelx
                    model, or entire list of parameters from complex model.
        multinom (bool): If True, assume model is defined without an explicit parameter for
                theta. Because uncertainty in theta must be accounted for to get
                correct uncertainties for other parameters, this function will
                automatically consider theta if multinom=True. In that case, the
                final entry of the returned uncertainties will correspond to
                theta.
        eps (float): Fractional stepsize to use when taking finite-difference derivatives
            Note that if eps*param is < 1e-6, then the step size for that parameter
            will simply be eps, to avoid numerical issues with small parameter
            perturbations.
        adj_and_org (bool): If False, return only adjusted Wald statistic. If True, also
                    return unadjusted statistic as second return value.
    """
    if multinom:
         func_multi = func_ex
         model = func_multi(p0, data.sample_sizes, grid_pts)
         theta_opt = Inference.optimal_sfs_scaling(model, data)
         # Also need to extend full_params
         if len(full_params) == len(p0):
             full_params = numpy.concatenate((full_params, [theta_opt]))
         p0 = list(p0) + [theta_opt]
         func_ex = lambda p, ns, pts: p[-1]*func_multi(p[:-1], ns, pts)

    # We only need to take derivatives with respect to the parameters in the
    # complex model that have been set to specified values in the simple model
    def diff_func(diff_params, ns, grid_pts):
         # diff_params argument is only the nested parameters. All the rest
         # should come from p0
         full_params = numpy.array(p0, copy=True, dtype=float)
         # Use numpy indexing to set relevant parameters
         full_params[nested_indices] = diff_params
         return func_ex(full_params, ns, grid_pts)

    # Reduce full params list to be same length as nested indices
    if len(full_params) == len(p0):
        full_params = numpy.asarray(full_params)[nested_indices]
    if len(full_params) != len(nested_indices):
        raise KeyError('Full parameters not equal in length to p0 or nested '
                       'indices')

    p_nested = numpy.asarray(p0)[nested_indices]
    GIM, H, J, cU = get_godambe(diff_func, grid_pts, all_boot, p_nested, data, 
                                eps, log=False)
    param_diff = full_params-p_nested

    wald_adj = numpy.dot(numpy.dot(numpy.transpose(param_diff),GIM), param_diff)
    wald_org = numpy.dot(numpy.dot(numpy.transpose(param_diff),H),param_diff)

    if adj_and_org:
        return wald_adj, wald_org
    return wald_adj

get_godambe(func_ex, grid_pts, all_boot, p0, data, eps, log=False, just_hess=False, boot_theta_adjusts=[])

Godambe information and Hessian matrices

Parameters:

Name Type Description Default
func_ex func

Model function

required
grid_pts list[float]

Number of grid points to evaluate the model function

required
all_boot list[Spectrum]

List of bootstrap frequency spectra

required
p0 list[float]

Best-fit parameters for func_ex.

required
data Spectrum

Original data frequency spectrum

required
eps float

Fractional stepsize to use when taking finite-difference derivatives Note that if eps*param is < 1e-6, then the step size for that parameter will simply be eps, to avoid numerical issues with small parameter perturbations.

required
log bool

If True, calculate derivatives in terms of log-parameters

False
just_hess bool

If True, only evaluate and return the Hessian matrix

False
boot_theta_adjusts list[float])

Factors by which to adjust theta for each bootstrap sample, relative to full data theta.

[]
Source code in dadi/Godambe.py
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
def get_godambe(func_ex, grid_pts, all_boot, p0, data, eps, log=False,
                just_hess=False, boot_theta_adjusts=[]):
    """
    Godambe information and Hessian matrices

    Args:
        func_ex (func): Model function
        grid_pts (list[float]): Number of grid points to evaluate the model function
        all_boot (list[Spectrum]): List of bootstrap frequency spectra
        p0 (list[float]): Best-fit parameters for func_ex.
        data (Spectrum): Original data frequency spectrum
        eps (float): Fractional stepsize to use when taking finite-difference derivatives
            Note that if eps*param is < 1e-6, then the step size for that parameter
            will simply be eps, to avoid numerical issues with small parameter
            perturbations.
        log (bool): If True, calculate derivatives in terms of log-parameters
        just_hess (bool): If True, only evaluate and return the Hessian matrix
        boot_theta_adjusts (list[float])): Factors by which to adjust theta for each bootstrap
                            sample, relative to full data theta.
    """
    ns = data.sample_sizes
    if not boot_theta_adjusts:
        boot_theta_adjusts = numpy.ones(len(all_boot))

    # Cache evaluations of the frequency spectrum inside our hessian/J 
    # evaluation function
    def func(params, data, theta_adjust=1):
        key = (func_ex.__hash__(), tuple(params), tuple(ns), tuple(grid_pts))
        if key not in cache:
            cache[key] = func_ex(params, ns, grid_pts)
        # theta_adjust deals with bootstraps that need  different thetas
        fs = theta_adjust*cache[key]
        return Inference.ll(fs, data)
    def log_func(logparams, data, theta_adjust=1):
        return func(numpy.exp(logparams), data, theta_adjust)

    # First calculate the observed hessian.
    # theta_adjust defaults to 1.
    if not log:
        hess = -get_hess(func, p0, eps, args=[data])
    else:
        hess = -get_hess(log_func, numpy.log(p0), eps, args=[data])

    if just_hess:
        return hess

    # Now the expectation of J over the bootstrap data
    J = numpy.zeros((len(p0), len(p0)))
    # cU is a column vector
    cU = numpy.zeros((len(p0),1))
    for ii, (boot,theta_adjust) in enumerate(zip(all_boot, boot_theta_adjusts)):
        boot = Spectrum(boot)
        if not log:
            grad_temp = get_grad(func, p0, eps, args=[boot, theta_adjust])
        else:
            grad_temp = get_grad(log_func, numpy.log(p0), eps,
                                 args=[boot, theta_adjust])
        J_temp = numpy.outer(grad_temp, grad_temp)
        J = J + J_temp
        cU = cU + grad_temp
    J = J/len(all_boot)
    cU = cU/len(all_boot)

    # G = H*J^-1*H
    J_inv = numpy.linalg.inv(J)
    godambe = numpy.dot(numpy.dot(hess, J_inv), hess)
    return godambe, hess, J, cU

get_grad(func, p0, eps, args=())

Calculate gradient vector

Parameters:

Name Type Description Default
func func

Model function

required
p0 list[float]

Parameters for func

required
eps list[float]

Fractional stepsize to use when taking finite-difference derivatives Note that if eps*param is < 1e-6, then the step size for that parameter will simply be eps, to avoid numerical issues with small parameter perturbations.

required
args list or tuple

Additional arguments to func

()
Source code in dadi/Godambe.py
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
def get_grad(func, p0, eps, args=()):
    """
    Calculate gradient vector

    Args:
        func (func): Model function
        p0 (list[float]): Parameters for func
        eps (list[float]): Fractional stepsize to use when taking finite-difference derivatives
            Note that if eps*param is < 1e-6, then the step size for that parameter
            will simply be eps, to avoid numerical issues with small parameter
            perturbations.
        args (list or tuple): Additional arguments to func
    """
    # Calculate step sizes for finite-differences.
    eps_in = eps
    eps = numpy.empty([len(p0)])
    one_sided = [False]*len(p0)
    for i, pval in enumerate(p0):
        if pval != 0:
            # Account for floating point arithmetic issues
            if pval*eps_in < 1e-6:
                eps[i] = eps_in
                one_sided[i] = True
            else:
                eps[i] = eps_in*pval
        else:
            # Account for parameters equal to zero
            eps[i] = eps_in

    grad = numpy.empty([len(p0), 1])
    for ii in range(len(p0)):
        pwork = numpy.array(p0, copy=True, dtype=float)

        if p0[ii] != 0 and not one_sided[ii]:
            pwork[ii] = p0[ii] + eps[ii]
            fp = func(pwork, *args)

            pwork[ii] = p0[ii] - eps[ii]
            fm = func(pwork, *args)

            grad[ii] = (fp - fm)/(2*eps[ii])
        else:
            # Do one-sided finite-difference 
            pwork[ii] = p0[ii] + eps[ii]
            fp = func(pwork, *args)

            pwork[ii] = p0[ii]
            fm = func(pwork, *args)

            grad[ii] = (fp - fm)/eps[ii]
            if two_pt_deriv_test:
                pwork[ii] = p0[ii] + 2*eps[ii]
                fpp = func(pwork, *args)
                grad[ii] = (4*fp - fpp - 3*fm)/(2*eps[ii])
    return grad

get_hess(func, p0, eps, args=())

Calculate Hessian matrix of partial second derivatives. Hij = dfunc/(dp_i dp_j)

Parameters:

Name Type Description Default
func func

Model function

required
p0 list[float]

Parameter values to take derivative around

required
eps list[float]

Fractional stepsize to use when taking finite-difference derivatives Note that if eps*param is < 1e-6, then the step size for that parameter will simply be eps, to avoid numerical issues with small parameter perturbations.

required
args list or tuple

Additional arguments to func

()
Source code in dadi/Godambe.py
 93
 94
 95
 96
 97
 98
 99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
def get_hess(func, p0, eps, args=()):
    """
    Calculate Hessian matrix of partial second derivatives. 
    Hij = dfunc/(dp_i dp_j)

    Args:
        func (func): Model function
        p0 (list[float]): Parameter values to take derivative around
        eps (list[float]): Fractional stepsize to use when taking finite-difference derivatives
            Note that if eps*param is < 1e-6, then the step size for that parameter
            will simply be eps, to avoid numerical issues with small parameter
            perturbations.
        args (list or tuple): Additional arguments to func
    """
    # Calculate step sizes for finite-differences.
    eps_in = eps
    eps = numpy.empty([len(p0)])
    one_sided = [False]*len(p0)
    for i, pval in enumerate(p0):
        if pval != 0:
            # Account for floating point arithmetic issues
            if pval*eps_in < 1e-6:
                eps[i] = eps_in
                one_sided[i] = True
            else:
                eps[i] = eps_in*pval
        else:
            # Account for parameters equal to zero
            eps[i] = eps_in

    f0 = func(p0, *args)
    hess = numpy.empty((len(p0), len(p0)))
    for ii in range(len(p0)):
        for jj in range(ii, len(p0)):
            hess[ii][jj] = hessian_elem(func, f0, p0, ii, jj, eps, args=args, one_sided=one_sided)
            hess[jj][ii] = hess[ii][jj]
    return hess

hessian_elem(func, f0, p0, ii, jj, eps, args=(), one_sided=None)

Calculate element [ii][jj] of the Hessian matrix, a matrix of partial second derivatives w.r.t. to parameters ii and jj

Parameters:

Name Type Description Default
func func

Model function

required
f0 array - like

Evaluation of func at p0

required
p0 list[float]

Parameters for func

required
ii int

Index of matrix row

required
jj int

Index of matrix column

required
eps list[float]

List of absolute step sizes to use for each parameter when taking finite differences.

required
args list or tuple

Additional arguments to func

()
one_sided list[bool]

Optionally, pass in a sequence of length p0 that determines whether a one-sided derivative will be used for each parameter.

None
Source code in dadi/Godambe.py
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
def hessian_elem(func, f0, p0, ii, jj, eps, args=(), one_sided=None):
    """
    Calculate element [ii][jj] of the Hessian matrix, a matrix
    of partial second derivatives w.r.t. to parameters ii and jj

    Args:
        func (func): Model function
        f0 (array-like): Evaluation of func at p0
        p0 (list[float]): Parameters for func
        ii (int): Index of matrix row
        jj (int): Index of matrix column
        eps (list[float]): List of absolute step sizes to use for each parameter when taking
            finite differences.
        args (list or tuple): Additional arguments to func
        one_sided (list[bool]): Optionally, pass in a sequence of length p0 that determines
                whether a one-sided derivative will be used for each parameter.
    """
    # Note that we need to specify dtype=float, to avoid this being an integer
    # array which will silently fail when adding fractional eps.
    if one_sided is None:
        one_sided = [False]*len(p0)

    pwork = numpy.array(p0, copy=True, dtype=float)
    if ii == jj:
        if pwork[ii] != 0 and not one_sided[ii]:
            pwork[ii] = p0[ii] + eps[ii]
            fp = func(pwork, *args)

            pwork[ii] = p0[ii] - eps[ii]
            fm = func(pwork, *args)

            element = (fp - 2*f0 + fm)/eps[ii]**2
        else:
            pwork[ii] = p0[ii] + 2*eps[ii]
            fpp = func(pwork, *args)

            pwork[ii] = p0[ii] + eps[ii]
            fp = func(pwork, *args)

            element = (fpp - 2*fp + f0)/eps[ii]**2
    else:
        if pwork[ii] != 0 and pwork[jj] != 0 and not one_sided[ii] and not one_sided[jj]:
            # f(xi + hi, xj + h)
            pwork[ii] = p0[ii] + eps[ii]
            pwork[jj] = p0[jj] + eps[jj]
            fpp = func(pwork, *args)

            # f(xi + hi, xj - hj)
            pwork[ii] = p0[ii] + eps[ii]
            pwork[jj] = p0[jj] - eps[jj]
            fpm = func(pwork, *args)

            # f(xi - hi, xj + hj)
            pwork[ii] = p0[ii] - eps[ii]
            pwork[jj] = p0[jj] + eps[jj]
            fmp = func(pwork, *args)

            # f(xi - hi, xj - hj)
            pwork[ii] = p0[ii] - eps[ii]
            pwork[jj] = p0[jj] - eps[jj]
            fmm = func(pwork, *args)

            element = (fpp - fpm - fmp + fmm)/(4 * eps[ii]*eps[jj])
        else:
            # f(xi + hi, xj + h)
            pwork[ii] = p0[ii] + eps[ii]
            pwork[jj] = p0[jj] + eps[jj]
            fpp = func(pwork, *args)

            # f(xi + hi, xj)
            pwork[ii] = p0[ii] + eps[ii]
            pwork[jj] = p0[jj]
            fpm = func(pwork, *args)

            # f(xi, xj + hj)
            pwork[ii] = p0[ii]
            pwork[jj] = p0[jj] + eps[jj]
            fmp = func(pwork, *args)

            element = (fpp - fpm - fmp + f0)/(eps[ii]*eps[jj])
    return element

score_stat(func_ex, grid_pts, all_boot, p0, data, nested_indices, multinom=True, eps=0.01, adj_and_org=False)

Calculate test stastic from score test

Parameters:

Name Type Description Default
func_ex func

Model function for complex model

required
grid_pts list[float]

Grid points to evaluate model function

required
all_boot list[Spectrum]

List of bootstrap frequency spectra

required
p0 list[float]

Best-fit parameters for the simple model, with nested parameter explicity defined. Although equal to values for simple model, should be in a list form that can be taken in by the complex model you'd like to evaluate.

required
data Spectrum

Original data frequency spectrum

required
nested_indices list[int]

List of positions of nested parameters in complex model parameter list

required
multinom bool

If True, assume model is defined without an explicit parameter for theta. Because uncertainty in theta must be accounted for to get correct uncertainties for other parameters, this function will automatically consider theta if multinom=True.

True
eps float

Fractional stepsize to use when taking finite-difference derivatives Note that if eps*param is < 1e-6, then the step size for that parameter will simply be eps, to avoid numerical issues with small parameter perturbations.

0.01
adj_and_org bool

If False, return only adjusted score statistic. If True, also return unadjusted statistic as second return value.

False
Source code in dadi/Godambe.py
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
def score_stat(func_ex, grid_pts, all_boot, p0, data, nested_indices,
               multinom=True, eps=0.01, adj_and_org=False):
    # XXX: Implement boot_theta_adjusts
    """
    Calculate test stastic from score test

    Args:
        func_ex (func): Model function for complex model
        grid_pts (list[float]): Grid points to evaluate model function
        all_boot (list[Spectrum]): List of bootstrap frequency spectra
        p0 (list[float]): Best-fit parameters for the simple model, with nested parameter
            explicity defined.  Although equal to values for simple model, should
            be in a list form that can be taken in by the complex model you'd like
            to evaluate.
        data (Spectrum): Original data frequency spectrum
        nested_indices (list[int]): List of positions of nested parameters in complex model
                        parameter list
        multinom (bool): If True, assume model is defined without an explicit parameter for
                theta. Because uncertainty in theta must be accounted for to get
                correct uncertainties for other parameters, this function will
                automatically consider theta if multinom=True.
        eps (float): Fractional stepsize to use when taking finite-difference derivatives
            Note that if eps*param is < 1e-6, then the step size for that parameter
            will simply be eps, to avoid numerical issues with small parameter
            perturbations.
        adj_and_org (bool): If False, return only adjusted score statistic. If True, also
                    return unadjusted statistic as second return value.
    """
    if multinom:
        func_multi = func_ex
        model = func_multi(p0, data.sample_sizes, grid_pts)
        theta_opt = Inference.optimal_sfs_scaling(model, data)
        p0 = list(p0) + [theta_opt]
        func_ex = lambda p, ns, pts: p[-1]*func_multi(p[:-1], ns, pts)

    # We only need to take derivatives with respect to the parameters in the
    # complex model that have been set to specified values in the simple model
    def diff_func(diff_params, ns, grid_pts):
        # diff_params argument is only the nested parameters. All the rest
        # should come from p0
        full_params = numpy.array(p0, copy=True, dtype=float)
        # Use numpy indexing to set relevant parameters
        full_params[nested_indices] = diff_params
        return func_ex(full_params, ns, grid_pts)

    p_nested = numpy.asarray(p0)[nested_indices]
    GIM, H, J, cU = get_godambe(diff_func, grid_pts, all_boot, p_nested, data, 
                                eps, log=False)

    score_org = numpy.dot(numpy.dot(numpy.transpose(cU),
                                    numpy.linalg.inv(H)),cU)[0,0]
    score_adj = numpy.dot(numpy.dot(numpy.transpose(cU),
                                    numpy.linalg.inv(J)),cU)[0,0]

    if adj_and_org:
        return score_adj, score_org
    return score_adj

sum_chi2_ppf(x, weights=(0, 1))

Percent point function (inverse of cdf) of weighted sum of chi^2 distributions.

Parameters:

Name Type Description Default
x array - like

Value(s) at which to evaluate ppf

required
weights array - like

Weights of chi^2 distributions, beginning with zero d.o.f. For example, weights=(0,1) is the normal chi^2 distribution with 1 d.o.f. For single parameters on the boundary, the correct distribution for the LRT is 0.5chi^2_0 + 0.5chi^2_1, which would be weights=(0.5,0.5).

(0, 1)
Source code in dadi/Godambe.py
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
def sum_chi2_ppf(x, weights=(0,1)):
    """
    Percent point function (inverse of cdf) of weighted sum of chi^2
    distributions.

    Args:
        x (array-like): Value(s) at which to evaluate ppf
        weights (array-like): Weights of chi^2 distributions, beginning with zero d.o.f.
                For example, weights=(0,1) is the normal chi^2 distribution with 1
                d.o.f. For single parameters on the boundary, the correct
                distribution for the LRT is 0.5*chi^2_0 + 0.5*chi^2_1, which would
                be weights=(0.5,0.5).
    """
    import scipy.stats.distributions as ssd
    # Ensure that weights are valid
    if abs(numpy.sum(weights) - 1) > 1e-6:
        raise ValueError('Weights must sum to 1.')
    # A little clunky, but we want to handle x = 0.5, and x = [2, 3, 4]
    # correctly. So if x is a scalar, we record that fact so we can return a
    # scalar on output.
    if numpy.isscalar(x):
        scalar_input = True
    # Convert x into an array, so we can index it easily.
    x = numpy.atleast_1d(x)
    # Calculate total cdf of all chi^2 dists with dof > 1.
    # (ssd.chi2.cdf(x,0) is always nan, so we avoid that.)
    cdf = numpy.sum([w*ssd.chi2.cdf(x, d+1) for (d, w)
                     in enumerate(weights[1:])], axis=0)
    # Add in contribution from 0 d.o.f.
    cdf[x > 0] += weights[0]
    # Convert to ppf
    ppf = 1-cdf

    if scalar_input:
        return ppf[0]
    else:
        return ppf