mousestyles package¶

Subpackages¶

Submodules¶

mousestyles.GLRT_distribution module¶

mousestyles.GLRT_distribution.hypo_exp_null(strain, mouse, day, law_est=0, exp_est=0, seed=-1)[source]¶

Return the outcome of the GLRT whose null hypothesis is the exponential distribution.

This function uses the Generalized Likelihood Ratio Test (GLRT) to test goodness of fit: in other words, which distribution is more likely.

In this function, we choose the exponential distribution to be the null and the power law distribution to be the alternative. We derived the test statistic theoretically and plugged in the MLEs as our estimates of the best parameters.

After calculating the parameters, we need to find the rejection region, critical value, or p-value. To get a more general test, we use the p-value instead of a critical value at a fixed significance level.

To find the p-value, we use simulation, and all random numbers are drawn by the module's random generation functions. Therefore, although the p-value should be a constant given the data, it varies from call to call unless the seed is set.

In general, if the p-value is too small, we reject the null and say that the exponential distribution is not a better fit than the power law distribution.

Parameters:

strain (int) – the strain number of the mouse
mouse (int) – the mouse number in its strain
day (int) – the day number
law_est (double (optional)) – the estimated parameter in the power law distribution
exp_est (double (optional)) – the estimated parameter in the exponential distribution

Returns:	p_value – the p-value of the test, computed under the null hypothesis.
Return type:	float

Examples

>>> hypo_exp_null (0, 0, 0)
1.0
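The simulated p-value described above can be sketched as follows. This is a minimal illustration, not the package's implementation: it assumes densities truncated at 1, namely p(x) = l * exp(-l * (x - 1)) for the exponential and p(x) = (a - 1) * x^(-a) for the power law, uses their closed-form MLEs, and all function names here are hypothetical.

```python
import numpy as np


def mle_exp(x):
    # MLE of l for the assumed density l * exp(-l * (x - 1)), x >= 1
    return 1.0 / (np.mean(x) - 1.0)


def mle_powerlaw(x):
    # MLE of a for the assumed density (a - 1) * x^(-a), x >= 1
    return 1.0 + len(x) / np.sum(np.log(x))


def loglik_exp(x, l):
    return len(x) * np.log(l) - l * np.sum(x - 1.0)


def loglik_powerlaw(x, a):
    return len(x) * np.log(a - 1.0) - a * np.sum(np.log(x))


def lr_statistic(x):
    # Log likelihood ratio: alternative (power law) minus null (exponential),
    # each evaluated at its own MLE. Large values speak against the null.
    return loglik_powerlaw(x, mle_powerlaw(x)) - loglik_exp(x, mle_exp(x))


def simulated_p_value(x, n_sim=1000, seed=0):
    # Monte Carlo p-value: simulate data under the fitted null and count
    # how often the simulated statistic is at least as large as the observed.
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    observed = lr_statistic(x)
    l_hat = mle_exp(x)
    sims = np.empty(n_sim)
    for i in range(n_sim):
        sample = 1.0 + rng.exponential(scale=1.0 / l_hat, size=len(x))
        sims[i] = lr_statistic(sample)
    return float(np.mean(sims >= observed))
```

Fixing `seed` makes the result reproducible, which mirrors the remark above that the p-value is only constant across calls when the seed is set.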

mousestyles.GLRT_distribution.hypo_powerLaw_null(strain, mouse, day, law_est=0, seed=-1)[source]¶

Return the outcome of the GLRT whose null hypothesis is the power law distribution.

This function uses the Generalized Likelihood Ratio Test (GLRT) to test goodness of fit: in other words, which distribution is more likely.

In this function, we choose the power law distribution to be the null and the exponential distribution to be the alternative. We derived the test statistic theoretically and plugged in the MLEs as our estimates of the best parameters.

After calculating the parameters, we need to find the rejection region, critical value, or p-value. To get a more general test, we use the p-value instead of a critical value at a fixed significance level.

To find the p-value, we use simulation, and all random numbers are drawn by the module's random generation functions. Therefore, although the p-value should be a constant given the data, it varies from call to call unless the seed is set.

In general, if the p-value is too small, we reject the null and say that the power law distribution is not a better fit than the exponential distribution.

Parameters:

strain (int) – the strain number of the mouse
mouse (int) – the mouse number in its strain
day (int) – the day number
law_est (double (optional)) – the estimated parameter in the power law distribution

Returns:	p_value – the p-value of the test, computed under the null hypothesis.
Return type:	float

Examples

>>> hypo_powerLaw_null(0, 0, 0)
0.0070000000000000001

mousestyles.GLRT_distribution.random_exp(n, l, seed=-1)[source]¶

Randomly generate points from a truncated exponential distribution.

The generation method uses the memorylessness property of the exponential distribution. Because the conditional survival function of an exponential distribution does not depend on the starting point, drawing from the truncated exponential distribution is the same as drawing from the regular exponential distribution and shifting by the truncation value.

Parameters:

n (int) – number of points
l (int) – exponential parameter lambda

Returns:	points – n points drawn from the target distribution.
Return type:	a vector of floats

Examples

>>> random_exp(4,2)
array([ 1.07592496,  1.19789646,  1.19759663,  1.03993227])
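The shift-based sampling described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the package's code: the truncation point is taken to be 1, and `random_truncated_exp` is a hypothetical name.

```python
import numpy as np


def random_truncated_exp(n, l, seed=None):
    # By memorylessness, X | X >= 1 has the same law as 1 + Exp(l),
    # so draw ordinary exponentials and shift them by the truncation point.
    rng = np.random.default_rng(seed)
    return 1.0 + rng.exponential(scale=1.0 / l, size=n)


points = random_truncated_exp(4, 2, seed=0)
```

Every generated point is at least 1, and the sample mean approaches 1 + 1/l as n grows.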

mousestyles.GLRT_distribution.random_powerlaw(n, a, seed=-1)[source]¶

Randomly generate points from a truncated power law distribution.

The generation method inverts the cumulative distribution function (CDF) of the truncated power law and evaluates it at random numbers drawn from Unif[0,1]. The theory behind it is $F^{-1}(U) \sim F$.

Parameters:

n (int) – number of points
a (int>1) – power law parameter alpha

Returns:	points – n points drawn from the target distribution.
Return type:	a vector of floats

Examples

>>> random_powerlaw(4,2)
array([  1.18097435,   1.04584078,   1.4650779 ,  36.03967524])
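Inverse-CDF sampling as described above can be sketched as follows. This is a hedged illustration, not the package's implementation: it assumes truncation at 1 and the inverse CDF F^(-1)(y) = (1 - y)^(1/(1-a)) documented for powerlaw_inverse_cdf, and `random_truncated_powerlaw` is a hypothetical name.

```python
import numpy as np


def random_truncated_powerlaw(n, a, seed=None):
    # Draw U ~ Unif[0, 1) and push it through the inverse CDF,
    # so that F^(-1)(U) follows the truncated power law.
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)
    return (1.0 - u) ** (1.0 / (1.0 - a))


points = random_truncated_powerlaw(4, 2, seed=0)
```

Under these assumptions the sample median approaches 2^(1/(a-1)), which gives a quick sanity check for the generator.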

mousestyles.distribution module¶

mousestyles.distribution.exp_inverse_cdf(y, l)[source]¶

The inverse CDF of the exponential distribution truncated at 1.

Parameters:

y (float in [0, 1], or a np.ndarray) – y in formula F^(-1)(y) = 1 - log(1 - y) / l
l (float > 0) – l in formula F^(-1)(y) = 1 - log(1 - y) / l

Returns:	x – The inverse CDF of the exponential distribution truncated at 1, with parameter l, at point y
Return type:	float

Examples

>>> exp_inverse_cdf(0.6,2)
1.4581453659370776
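The inverse CDF formula above can be recovered with a short derivation. It assumes the truncated density $p(x) = l e^{-l(x-1)}$ for $x \ge 1$, which is consistent with the documented formula but is not stated explicitly in the docstring.

```latex
F(x) = \int_1^x l\, e^{-l(t-1)}\, dt = 1 - e^{-l(x-1)}, \qquad x \ge 1,
\\
y = 1 - e^{-l(x-1)}
\;\Longleftrightarrow\;
x = F^{-1}(y) = 1 - \frac{\log(1-y)}{l}, \qquad 0 \le y < 1 .
```

For y = 0.6 and l = 2 this gives 1 - log(0.4)/2, which agrees with the example value 1.458... shown above.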

mousestyles.distribution.exp_pdf(x, l)[source]¶

The probability density function of truncated exponential.

Parameters:

x (float, or a np.ndarray) – x in formula p(x) = lambda * exp(-lambda * x).
l (float) – lambda in formula p(x) = lambda * exp(-lambda * x).

Returns:	probability density – The probability density of the exponential distribution at x.
Return type:	float

Examples

>>> exp_pdf(1, 1)
0.36787944117144233

mousestyles.distribution.powerlaw_inverse_cdf(y, a)[source]¶

The inverse CDF function of power law distribution

Parameters:

y (float in [0, 1], or a np.ndarray) – y in formula F^(-1)(y) = (1 - y)^(1/(1-a))
a (float > 1) – a in formula F^(-1)(y) = (1 - y)^(1/(1-a))

Returns:	x – The inverse CDF function of power law distribution with parameter a at point y
Return type:	float

Examples

>>> powerlaw_inverse_cdf(0.5, 5)
1.189207115002721
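A round-trip check illustrates the formula above: applying the CDF to the inverse CDF should return the original probability. The sketch assumes the truncated power law CDF F(x) = 1 - x^(1-a) for x >= 1, which is what the documented inverse implies; both function names are hypothetical.

```python
import numpy as np


def powerlaw_cdf(x, a):
    # Assumed CDF of the power law truncated at 1: F(x) = 1 - x^(1 - a)
    return 1.0 - np.asarray(x, dtype=float) ** (1.0 - a)


def powerlaw_inverse_cdf_sketch(y, a):
    # Documented inverse: F^(-1)(y) = (1 - y)^(1 / (1 - a))
    return (1.0 - np.asarray(y, dtype=float)) ** (1.0 / (1.0 - a))


y = np.array([0.0, 0.25, 0.5, 0.9])
x = powerlaw_inverse_cdf_sketch(y, a=5)
round_trip = powerlaw_cdf(x, a=5)  # should reproduce y up to rounding
```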

mousestyles.distribution.powerlaw_pdf(x, a)[source]¶

The probability density function of truncated power law.

Parameters:

x (float > 0, or a np.ndarray) – x in formula p(x) = (alpha - 1) * x^(-alpha).
a (float > 1) – alpha in formula p(x) = (alpha - 1) * x^(-alpha).

Returns:	probability density – The probability density of power law at x.
Return type:	float

Examples

>>> powerlaw_pdf (2, 2)
0.25
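The density formula above can be checked numerically: over [1, infinity) it should integrate to 1. The sketch below is illustrative only; `powerlaw_pdf_sketch` is a hypothetical re-implementation of the documented formula, and the integral is cut off at 1000, where the remaining tail mass is negligible for a = 3.

```python
import numpy as np


def powerlaw_pdf_sketch(x, a):
    # Documented formula: p(x) = (a - 1) * x^(-a)
    x = np.asarray(x, dtype=float)
    return (a - 1.0) * x ** (-a)


grid = np.linspace(1.0, 1000.0, 200001)
density = powerlaw_pdf_sketch(grid, a=3.0)
# Trapezoid rule written out by hand: average adjacent heights times widths.
total_mass = np.sum((density[1:] + density[:-1]) * np.diff(grid)) / 2.0
```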

mousestyles.est_power_param module¶

mousestyles.est_power_param.fit_dist_all()[source]¶

Return the estimators of truncated power law and exponential for each mouse day.

Returns:	estimator – The estimator of truncated exponential distribution.
Return type:	a float number

Examples

>>> fit()
7.385844980814098

mousestyles.est_power_param.fit_exponential(strain, mouse, day)[source]¶

Return the estimator of truncated exponential.

Parameters:

strain (int) – the strain number of the mouse
mouse (int) – the mouse number in its strain
day (int) – the day number

Returns:	estimator – The estimator of truncated exponential distribution.
Return type:	a float number

Examples

>>> fit_exponential (0, 0, 0)
7.385844980814098

mousestyles.est_power_param.fit_powerlaw(strain, mouse, day)[source]¶

Return the estimator of truncated power law.

Parameters:

strain (int) – the strain number of the mouse
mouse (int) – the mouse number in its strain
day (int) – the day number

Returns:	estimator – The estimator of truncated power law.
Return type:	a float number

Examples

>>> fit_powerlaw (0, 0, 0)
9.4748705008269827

mousestyles.est_power_param.getdistance(strain, mouse, day)[source]¶

Return the distances between each pair of consecutive points among the coordinates, keeping only the distances greater than 1 cm (truncated).

Parameters:

strain (int) – the strain number of the mouse
mouse (int) – the mouse number in its strain
day (int) – the day number

Returns:

cut_dist (an array of numbers) – The vector of truncated distances.

Examples

>>> getdistance (0, 0, 0)
array([ 1.00648944,  1.02094319,  1.0178885 , ...,  1.00099351,
1.01191156,  1.00423354])
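The truncated-distance computation described above can be sketched on a plain coordinate array. This is a minimal illustration that assumes the input is an (N x 2) array of x, y positions in centimetres; it does not load mouse data, and `truncated_step_distances` is a hypothetical name.

```python
import numpy as np


def truncated_step_distances(xy, cutoff=1.0):
    # Euclidean distance between each pair of consecutive points,
    # keeping only the steps longer than the cutoff.
    xy = np.asarray(xy, dtype=float)
    steps = np.diff(xy, axis=0)
    dist = np.hypot(steps[:, 0], steps[:, 1])
    return dist[dist > cutoff]


path = [[0.0, 0.0], [3.0, 4.0], [3.0, 4.5], [0.0, 0.5]]
cut_dist = truncated_step_distances(path)
```

In this toy path the middle step of 0.5 cm is dropped and the two 5 cm steps are kept.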

mousestyles.intervals module¶

Finite Union of Intervals

Implements a class to handle finite (disjoint) unions of intervals.

Assumes that intervals are always closed and that the union is disjoint.
Open intervals remaining at the end of any operation (e.g. complement) are always made closed, e.g. [0,1]^C = [-np.inf,0] ∪ [1,np.inf].
Unbounded end intervals are handled using -np.inf and np.inf.
Point intervals [a,a] get only basic handling.
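The conventions above (closed, disjoint, sorted intervals with infinite endpoints) can be sketched with plain (M x 2) arrays. This is an illustration under those stated conventions, not the Intervals class itself; `merge_intervals` and `complement` are hypothetical free functions.

```python
import numpy as np


def merge_intervals(intervals):
    # Sort closed intervals by start and merge any that overlap or touch,
    # which yields a sorted, disjoint union.
    arr = np.asarray(intervals, dtype=float)
    if arr.size == 0:
        return np.empty((0, 2))
    arr = arr[np.argsort(arr[:, 0])]
    merged = [arr[0].tolist()]
    for a, b in arr[1:]:
        if a <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], b)
        else:
            merged.append([a, b])
    return np.array(merged)


def complement(intervals):
    # Pair consecutive endpoints, padded with -inf and +inf, to obtain the
    # gaps; the gaps are returned as closed intervals, as described above.
    merged = merge_intervals(intervals)
    edges = np.concatenate(([-np.inf], merged.ravel(), [np.inf]))
    gaps = edges.reshape(-1, 2)
    return gaps[gaps[:, 0] < gaps[:, 1]]
```

With this sketch, `complement([[0, 1]])` gives the two closed intervals [-inf, 0] and [1, inf], matching the example in the description.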

class mousestyles.intervals.Intervals(intervals=None)[source]¶

Bases: object

Finite Union of Intervals [ai,bi] backed by sorted lists.

parameters: intervals: (M x 2) numpy np.double array

ASs(ISDT=20)[source]¶: returns new object of Active States given self as Events

ISs(ISDT=20)[source]¶: returns new object of Inactive States given self as Events

complement()[source]¶: New Intervals object which is the complement of self.

connect_gaps(eps=0.001)[source]¶: connects consecutive intervals separated by lengths <= eps

connect_gaps_by_rule(rule)[source]¶

Returns a new object with gaps connected when rule returns True.

Parameters: rule: Callable that takes parameters start_time and end_time.

contains(x)[source]¶: Check if x is in the Finite Union of Intervals.

copy()[source]¶

index_of_first_intersection(x, find_nearest=False)[source]¶: finds interval nearest to given number x and containing x if find_nearest=False: doesn’t require x to be in the interval

intersect(F)[source]¶: New Intervals object which is the intersection of self and Intervals F.

intersect_with_interval(a, b)[source]¶: returns (not a copy) Intervals object which is the intersection of self and [a, b] (faster than intersect)

is_empty()[source]¶

load(filename=u'Intervals_save.npz')[source]¶

measure()[source]¶

num()[source]¶

remove(other)[source]¶

save(filename=u'Intervals_save')[source]¶

subordinate_to_array(arr)[source]¶: returns a new Intervals object with only intervals containing elements of arr (NOTE: arr is assumed sorted)

symmetric_difference(other)[source]¶

trim(eps=0.001)[source]¶: Removes intervals with lengths <= eps.

union(F)[source]¶: New Intervals object which is the union of self and Intervals F.
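The connect_gaps and trim methods listed above can be illustrated on a sorted, disjoint (M x 2) array with at least one row. This is only a sketch of the documented behaviour ("<= eps" for both operations); the free functions below are hypothetical and do not reproduce the class internals.

```python
import numpy as np


def connect_gaps(intervals, eps=0.001):
    # Join consecutive intervals whose separating gap has length <= eps.
    arr = np.asarray(intervals, dtype=float)
    out = [arr[0].tolist()]
    for a, b in arr[1:]:
        if a - out[-1][1] <= eps:
            out[-1][1] = max(out[-1][1], b)
        else:
            out.append([a, b])
    return np.array(out)


def trim(intervals, eps=0.001):
    # Drop intervals whose length is <= eps.
    arr = np.asarray(intervals, dtype=float)
    return arr[(arr[:, 1] - arr[:, 0]) > eps]
```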

mousestyles.intervals.binary_from_intervals(intervals, length=None)[source]¶: From an intervals object produce a binary sequence of size length

mousestyles.intervals.intervals_from_binary(bin_array, times)[source]¶: Given a one dimensional bin_array of 0s and 1s, returns an Intervals object of times corresponding to consecutive 1s

mousestyles.intervals.timestamps_to_interval(array, eps=0.01)[source]¶: given a 1D array with event timestamps, returns an interval centered on timestamp and eps wide. default 0.01 is half of minimum HCM sampling rate
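The two conversions described above can be sketched with NumPy arrays standing in for Intervals objects. The choice to end each run at the time of its last 1, and the reading of "eps wide" as half of eps on each side of the timestamp, are assumptions of this sketch; the function names are hypothetical.

```python
import numpy as np


def runs_of_ones(bin_array, times):
    # Find where the padded 0/1 sequence switches on (+1) and off (-1);
    # each run becomes [time of first 1, time of last 1].
    b = np.asarray(bin_array, dtype=int)
    t = np.asarray(times, dtype=float)
    change = np.diff(np.concatenate(([0], b, [0])))
    starts = np.flatnonzero(change == 1)
    ends = np.flatnonzero(change == -1) - 1
    return np.column_stack((t[starts], t[ends]))


def timestamps_to_intervals_sketch(timestamps, eps=0.01):
    # One interval per timestamp, centered on it and eps wide in total.
    t = np.asarray(timestamps, dtype=float)
    return np.column_stack((t - eps / 2.0, t + eps / 2.0))
```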

mousestyles.kde module¶

mousestyles.kde.kde(x, x_grid, symmetric_correction=False, cutoff=1)[source]¶

Return a numpy.ndarray object of estimated density

Parameters:

x (numpy.ndarray) – data, as realizations of variable X
x_grid (numpy.ndarray) – the grid points for the estimated density
symmetric_correction (boolean) – a method indicator. If False, do common Gaussian kernel density estimation (kde). If True, do common Gaussian kde on data generated by concatenating x with its reflection around the cutoff point, then transform the estimated kde back by a factor of 2. Used, e.g., for kde of nonnegative data.
cutoff (float) – the axis of symmetry for the symmetric correction

Returns:	pdf – estimated density at the specified grid points x_grid
Return type:	numpy.ndarray

Examples

>>> kde(x = np.array([2,3,1,0]), x_grid=np.linspace(0, 5, 10))
array([ 0.17483395,  0.21599529,  0.23685855,  0.24007961,  0.22670763,
    0.19365019,  0.14228937,  0.08552725,  0.04043597,  0.01463953])
>>> x1 = np.concatenate([norm(-1, 1.).rvs(400), norm(1, 0.3).rvs(100)])
>>> pdf1 = kde(x=x1, x_grid=np.linspace(0, 5, 100),
...            symmetric_correction=True, cutoff=1)
>>> pdf1
array([ 0.26625297,  0.26818492,  0.27105849,  0.27489486,  0.27968752, ...
    0.07764054,  0.07239964,  0.06736559,  0.06254175,  0.05793043])
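The symmetric correction described in the parameter list (reflect the data around the cutoff, estimate, then multiply by 2) can be sketched with a hand-written Gaussian kernel estimator. This is an illustration, not the package's code: the bandwidth uses Scott's rule as an assumption, and both function names are hypothetical.

```python
import numpy as np


def gaussian_kde_1d(x, x_grid, bandwidth):
    # Average of Gaussian bumps centered on the data points.
    z = (x_grid[:, None] - x[None, :]) / bandwidth
    norm_const = len(x) * bandwidth * np.sqrt(2.0 * np.pi)
    return np.exp(-0.5 * z ** 2).sum(axis=1) / norm_const


def reflected_kde(x, x_grid, cutoff=0.0, bandwidth=None):
    x = np.asarray(x, dtype=float)
    x_grid = np.asarray(x_grid, dtype=float)
    # Concatenate the data with its mirror image around the cutoff.
    mirrored = np.concatenate([x, 2.0 * cutoff - x])
    if bandwidth is None:
        # Scott's rule (an assumption of this sketch).
        bandwidth = mirrored.std(ddof=1) * len(mirrored) ** (-1.0 / 5.0)
    # Half of the mirrored mass lies on each side of the cutoff, so the
    # estimate is doubled to integrate to 1 on one side.
    return 2.0 * gaussian_kde_1d(mirrored, x_grid, bandwidth)
```

Evaluated on a grid that starts at the cutoff, the corrected estimate integrates to approximately 1 instead of losing the mass that an uncorrected estimate would place beyond the boundary.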

mousestyles.mww module¶

mousestyles.mww.MWW_allmice(step=50, verbose=False)[source]¶

Aggregates MWW_mice data for all available strains of mice.

Parameters:

step (float) – Time interval length used to compute distances. Default is 1s. See data.distances_bymouse for more information.
verbose (boolean) –

Returns:

mww_values – MWW_mice outputs for each strain. mww_values[i] corresponds to the ith strain.

Examples

>>> mww_values = MWW_allmice()

mousestyles.mww.MWW_mice(strain, step=50, verbose=False)[source]¶

Compare distributions of distances among mice of the same strain. Use p-values of the Mann-Whitney U test.

Parameters:

strain (integer) – Number of the strain.
step (float) – Time interval length used to compute distances. Default is 1s. See data.distances_bymouse for more information.
verbose (boolean) –

Returns:

cor – p-values of the Mann-Whitney U test for each pair of distance samples among mice of the corresponding strain.

Examples

>>> cor = MWW_mice(0)

mousestyles.mww.MWW_strains(step=50, verbose=False)[source]¶

Compare distributions of distances among strains. Proceed as if the mice in each strain are i.i.d. samples, and compare the p-values of the Mann-Whitney U test.

Parameters:

step (float) – Time interval length used to compute distances. Default is 1s. See data.distances_bymouse for more information.
verbose (boolean) –

Returns:

cor – p-values of the Mann-Whitney U test for each pair of distance samples among strains of mice.

Examples

>>> cor = MWW_strains()

mousestyles.mww.get_pvalues(m)[source]¶

This function takes a list of sampled distributions and computes the p-value of the two-sided Mann-Whitney U test for each pair of samples.

The Mann-Whitney U test is a test for assessing whether two independent samples come from the same distribution. The null hypothesis for this test is that the two groups have the same distribution, while the alternative hypothesis is that one group has larger (or smaller) values than the other.

Null hypothesis $H_0$: $P(X>Y)=P(Y>X)$. Alternative $H_1$: not $H_0$.

The Mann-Whitney U test is similar to the Wilcoxon test, but can be used to compare multiple samples that aren’t necessarily paired.

Parameters: m (list of numpy arrays) – Sampled distributions.

Returns:

cor (2 dimensional array of p-values) – cor[i,j] is the p-value of the MWW test between the samples i and j.

Notes

A p-value < 0.05 means that there is strong evidence to reject the null hypothesis.

References

1. Mann-Whitney U test: http://tqmp.org/RegularArticles/vol04-1/p013/p013.pdf
2. Non parametric tests: http://www.mit.edu/~6.s085/notes/lecture5.pdf

Examples

>>> cor = get_pvalues([np.array([1, 2, 3]), np.array([1, 1, 2])])
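The pairwise structure of the returned matrix can be sketched as follows. To stay self-contained, the sketch computes the two-sided p-value with the large-sample normal approximation of the U statistic, without tie or continuity correction; this is a stated simplification, not the method used by get_pvalues, and a library routine such as scipy.stats.mannwhitneyu would normally be preferred. The function names are hypothetical.

```python
import math

import numpy as np


def mww_p_value(x, y):
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = len(x), len(y)
    # U counts the pairs with x_i > y_j, with ties counted as one half.
    diff = x[:, None] - y[None, :]
    u = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    mean = n1 * n2 / 2.0
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean) / sd
    # Two-sided tail probability of a standard normal.
    return math.erfc(abs(z) / math.sqrt(2.0))


def pairwise_p_values(samples):
    # cor[i, j] holds the p-value for samples i and j; the matrix is
    # symmetric and the diagonal is set to 1 (a sample against itself).
    k = len(samples)
    cor = np.ones((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            cor[i, j] = cor[j, i] = mww_p_value(samples[i], samples[j])
    return cor
```

The normal approximation is only reasonable for moderately large samples, so tiny inputs such as the three-element arrays in the example above are better handled by an exact method.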

mousestyles.mww.plot_cor(data)[source]¶

Plot the p-values output by the Mann-Whitney U test using a correlation matrix representation.

Parameters:	data (MWW_allmice output) –
Returns:	plot (correlation matrix)

Examples

>>> strains = MWW_strains()
>>> plot_cor(strains)

mousestyles.mww.plot_cor_multi(mww_values)[source]¶

Vectorized version of plot_cor. Plots several correlation matrices side by side using plot_cor.

Parameters:	mww_values (MWW_allmice output) –
Returns:	plot (correlation matrix)

Examples

>>> allmice = MWW_allmice()
>>> plot_cor_multi(allmice)

Module contents¶

mousestyles is the final project for UC Berkeley’s 2016 Masters Capstone Project class. It is based on ideas, code, and data from the Tecott Lab at UCSF.
