mousestyles package



mousestyles.GLRT_distribution module

mousestyles.GLRT_distribution.hypo_exp_null(strain, mouse, day, law_est=0, exp_est=0, seed=-1)[source]

Return the outcome of the GLRT with the exponential distribution as the null hypothesis.

This function uses the Generalized Likelihood Ratio Test (GLRT) to test goodness of fit: in other words, which distribution is more likely.

In this function, the exponential distribution is the null and the power law distribution is the alternative. The test statistic is derived from theory, with the MLEs plugged in as the estimates of the best parameters.

Once the parameters are estimated, we need a rejection region, a critical value, or a p-value. To keep the test general, we use the p-value rather than a critical value tied to one significance level.

The p-value is found by simulation, with all random numbers drawn using the random generation functions of this module. Therefore, although the p-value should be a constant given the data, the value returned by this function varies between calls unless the seed is set.

In general, if the p-value is too small, we reject the null and conclude that the exponential distribution is not a better fit than the power law distribution.

  • strain (int) – the strain number of the mouse
  • mouse (int) – the mouse number in its strain
  • day (int) – the day number
  • law_est (double (optional)) – the estimated parameter in law distribution
  • exp_est (double (optional)) – the estimated parameter in exponential distribution

the probability (p-value) under the null, used to decide whether to reject it.

Return type:



>>> hypo_exp_null(0, 0, 0)
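
The simulation step described above can be sketched as a parametric bootstrap. This is only an illustration under stated assumptions, not the package's implementation: the helper names (`exp_mle`, `powerlaw_mle`, `log_lik_ratio`, `simulated_p_value`), the densities truncated at 1, the closed-form MLEs, and the seeding via `numpy.random.default_rng` are all hypothetical choices.

```python
import numpy as np


def exp_mle(x):
    # Assumed MLE for the density l * exp(-l * (x - 1)), x >= 1.
    return 1.0 / (np.mean(x) - 1.0)


def powerlaw_mle(x):
    # Assumed MLE for the density (a - 1) * x ** (-a), x >= 1.
    return 1.0 + len(x) / np.sum(np.log(x))


def log_lik_ratio(x):
    # log L(power law) - log L(exponential), each evaluated at its MLE.
    n = len(x)
    l_hat, a_hat = exp_mle(x), powerlaw_mle(x)
    ll_exp = n * np.log(l_hat) - l_hat * np.sum(x - 1.0)
    ll_pow = n * np.log(a_hat - 1.0) - a_hat * np.sum(np.log(x))
    return ll_pow - ll_exp


def simulated_p_value(x, n_sim=500, seed=0):
    # Null: exponential. Simulate datasets from the fitted null and
    # count how often the statistic is at least as large as observed.
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    observed = log_lik_ratio(x)
    l_hat = exp_mle(x)
    sims = np.array([
        log_lik_ratio(1.0 + rng.exponential(1.0 / l_hat, size=len(x)))
        for _ in range(n_sim)
    ])
    return float(np.mean(sims >= observed))
```

With a fixed `seed` the returned p-value is reproducible; without one it changes from call to call, which matches the behaviour described for the documented function.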
mousestyles.GLRT_distribution.hypo_powerLaw_null(strain, mouse, day, law_est=0, seed=-1)[source]

Return the outcome of the GLRT with the power law distribution as the null hypothesis.

This function uses the Generalized Likelihood Ratio Test (GLRT) to test goodness of fit: in other words, which distribution is more likely.

In this function, the power law distribution is the null and the exponential distribution is the alternative. The test statistic is derived from theory, with the MLEs plugged in as the estimates of the best parameters.

Once the parameters are estimated, we need a rejection region, a critical value, or a p-value. To keep the test general, we use the p-value rather than a critical value tied to one significance level.

The p-value is found by simulation, with all random numbers drawn using the random generation functions of this module. Therefore, although the p-value should be a constant given the data, the value returned by this function varies between calls unless the seed is set.

In general, if the p-value is too small, we reject the null and conclude that the power law distribution is not a better fit than the exponential distribution.

  • strain (int) – the strain number of the mouse
  • mouse (int) – the mouse number in its strain
  • day (int) – the day number
  • law_est (double (optional)) – the estimated parameter in law distribution

the probability (p-value) under the null, used to decide whether to reject it.

Return type:



>>> hypo_powerLaw_null(0, 0, 0)
mousestyles.GLRT_distribution.random_exp(n, l, seed=-1)[source]

Randomly generate points from a truncated exponential distribution.

The points are generated using the memorylessness property of the exponential distribution. Because the survival function of an exponential distribution beyond any threshold keeps the same form, drawing from the truncated exponential distribution is equivalent to drawing from a regular exponential distribution and shifting the draws by the truncation value.

  • n (int) – number of points
  • l (int) – exponential parameter lambda

points – n points drawn from the target distribution.

Return type:

a vector of floats


>>> random_exp(4,2)
array([ 1.07592496,  1.19789646,  1.19759663,  1.03993227])
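
The shift argument above can be illustrated in a few lines. This is a sketch only: `truncated_exp_sample`, its `cutoff` argument, and the use of `numpy.random.default_rng` are assumptions and not the package's code.

```python
import numpy as np


def truncated_exp_sample(n, lam, cutoff=1.0, seed=None):
    # By memorylessness, an exponential conditioned on exceeding `cutoff`
    # has the same law as `cutoff` plus a fresh exponential draw.
    rng = np.random.default_rng(seed)
    return cutoff + rng.exponential(scale=1.0 / lam, size=n)


sample = truncated_exp_sample(100_000, 2.0, seed=0)
# Every draw is at least the cutoff, and the mean is close to 1 + 1/lam.
print(sample.min() >= 1.0, round(sample.mean(), 2))
```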
mousestyles.GLRT_distribution.random_powerlaw(n, a, seed=-1)[source]

Randomly generate points from a truncated power law distribution.

The points are generated by inverse transform sampling: random numbers drawn from Unif[0,1] are passed through the inverse cumulative distribution function (CDF) of the truncated power law. The theory behind it is \(F^{-1}(U) \sim F\) for \(U\) uniform on [0,1].

  • n (int) – number of points
  • a (int>1) – power law parameter alpha

points – n points drawn from the target distribution.

Return type:

a vector of floats


>>> random_powerlaw(4,2)
array([  1.18097435,   1.04584078,   1.4650779 ,  36.03967524])
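
Inverse transform sampling for the power law truncated at 1 might look like the sketch below. It assumes the inverse CDF F^(-1)(y) = (1 - y)^(1/(1-a)) documented for `powerlaw_inverse_cdf`; the function names and the seeding are hypothetical.

```python
import numpy as np


def powerlaw_inverse_cdf_sketch(y, a):
    # Inverse of F(x) = 1 - x ** (1 - a) for x >= 1 and a > 1.
    return (1.0 - np.asarray(y, dtype=float)) ** (1.0 / (1.0 - a))


def truncated_powerlaw_sample(n, a, seed=None):
    rng = np.random.default_rng(seed)
    u = rng.random(n)  # uniform draws in [0, 1)
    return powerlaw_inverse_cdf_sketch(u, a)


sample = truncated_powerlaw_sample(100_000, 2.0, seed=0)
# For a = 2 the CDF at x = 2 is 1 - 2 ** (-1) = 0.5.
frac_below_2 = np.mean(sample <= 2.0)
```

Comparing `frac_below_2` with the theoretical value 0.5 is a quick sanity check that the draws follow the target distribution.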

mousestyles.distribution module

mousestyles.distribution.exp_inverse_cdf(y, l)[source]

The inverse CDF of the exponential distribution truncated at 1.

  • y (float in [0, 1], or a np.ndarray) – y in the formula F^(-1)(y) = 1 - log(1 - y) / l
  • l (float > 0) – l in the formula F^(-1)(y) = 1 - log(1 - y) / l

x – The inverse CDF of the exponential distribution truncated at 1, with parameter l, evaluated at point y

Return type:



>>> exp_inverse_cdf(0.6,2)
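
As a check on the formula, inverting F^(-1)(y) = 1 - log(1 - y) / l gives the CDF F(x) = 1 - exp(-l (x - 1)) for x >= 1, and the two should round-trip. The function names below are hypothetical stand-ins, not the package's functions.

```python
import numpy as np


def exp_cdf_sketch(x, l):
    # CDF implied by the documented inverse: F(x) = 1 - exp(-l * (x - 1)).
    return 1.0 - np.exp(-l * (np.asarray(x, dtype=float) - 1.0))


def exp_inverse_cdf_sketch(y, l):
    # The documented formula F^(-1)(y) = 1 - log(1 - y) / l.
    return 1.0 - np.log(1.0 - np.asarray(y, dtype=float)) / l


y = np.array([0.0, 0.1, 0.6, 0.9])
x = exp_inverse_cdf_sketch(y, 2.0)
# Applying the CDF to x recovers y; y = 0 maps to the truncation point 1.
round_trip_ok = np.allclose(exp_cdf_sketch(x, 2.0), y)
```

For y = 0.6 and l = 2 the formula gives 1 - log(0.4) / 2, roughly 1.458.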
mousestyles.distribution.exp_pdf(x, l)[source]

The probability density function of truncated exponential.

  • x (float, or a np.ndarray) – x in formula p(x)=lambda*exp(-lambda*x).
  • l (float) – lambda in formula p(x)=lambda*exp(-lambda*x).

probability density – The probability density of truncated exponential at x.

Return type:



>>> exp_pdf(1, 1)
mousestyles.distribution.powerlaw_inverse_cdf(y, a)[source]

The inverse CDF function of power law distribution

  • y (float in [0, 1], or a np.ndarray) – y in formula F^(-1)(y) = (1 - y)^(1/(1-a))
  • a (float > 1) – a in formula F^(-1)(y) = (1 - y)^(1/(1-a))

x – The inverse CDF function of power law distribution with parameter a at point y

Return type:



>>> powerlaw_inverse_cdf(0.5, 5)
mousestyles.distribution.powerlaw_pdf(x, a)[source]

The probability density function of truncated power law.

  • x (float > 0, or a np.ndarray) – x in formula p(x)=(alpha-1)*x^(-alpha).
  • a (float > 1) – alpha in formula p(x)=(alpha-1)*x^(-alpha).

probability density – The probability density of power law at x.

Return type:



>>> powerlaw_pdf(2, 2)
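
The formula p(x) = (alpha - 1) * x^(-alpha) is a proper density on [1, infinity), which a short numerical check can confirm. The sketch below re-implements the formula under a hypothetical name and integrates it with a hand-written trapezoid rule; it is not the package's code.

```python
import numpy as np


def powerlaw_pdf_sketch(x, a):
    # Density of the power law truncated at 1: (a - 1) * x ** (-a).
    return (a - 1.0) * np.asarray(x, dtype=float) ** (-a)


# Log-spaced grid from 1 to 1e8 so the heavy tail is covered.
grid = np.logspace(0, 8, 200_001)
vals = powerlaw_pdf_sketch(grid, 2.0)
# Trapezoid rule: average of neighbouring heights times the step width.
area = np.sum((vals[1:] + vals[:-1]) / 2.0 * np.diff(grid))
value_at_2 = float(powerlaw_pdf_sketch(2.0, 2.0))  # (2 - 1) * 2 ** (-2)
```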

mousestyles.est_power_param module


Return the estimators of truncated power law and exponential for each mouse day.

Returns: estimator – The estimator of truncated exponential distribution.
Return type: a float number


>>> fit()
mousestyles.est_power_param.fit_exponential(strain, mouse, day)[source]

Return the estimator of truncated exponential.

  • strain (int) – the strain number of the mouse
  • mouse (int) – the mouse number in its strain
  • day (int) – the day number

estimator – The estimator of truncated exponential distribution.

Return type:

a float number


>>> fit_exponential(0, 0, 0)
mousestyles.est_power_param.fit_powerlaw(strain, mouse, day)[source]

Return the estimator of truncated power law.

  • strain (int) – the strain number of the mouse
  • mouse (int) – the mouse number in its strain
  • day (int) – the day number

estimator – The estimator of truncated power law.

Return type:

a float number


>>> fit_powerlaw(0, 0, 0)
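
If the estimators are maximum likelihood estimators for distributions truncated at 1 (an assumption; the fitting method is not spelled out here), they have closed forms. The sketch below uses hypothetical function names and checks the closed forms on simulated data rather than on mouse data.

```python
import numpy as np


def powerlaw_mle_sketch(x):
    # Maximizes sum(log((a - 1) * x ** (-a))): a_hat = 1 + n / sum(log x).
    x = np.asarray(x, dtype=float)
    return 1.0 + x.size / np.sum(np.log(x))


def exponential_mle_sketch(x):
    # Maximizes sum(log(l * exp(-l * (x - 1)))): l_hat = 1 / (mean(x) - 1).
    x = np.asarray(x, dtype=float)
    return 1.0 / (np.mean(x) - 1.0)


rng = np.random.default_rng(0)
# Simulated data with known parameters a = 3 and l = 2.
pow_data = (1.0 - rng.random(50_000)) ** (1.0 / (1.0 - 3.0))
exp_data = 1.0 + rng.exponential(scale=1.0 / 2.0, size=50_000)
a_hat = powerlaw_mle_sketch(pow_data)
l_hat = exponential_mle_sketch(exp_data)
```

Both estimates should land close to the parameters used to simulate the data.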
mousestyles.est_power_param.getdistance(strain, mouse, day)[source]

Return the distances between each pair of consecutive points among the coordinates, keeping only the distances greater than 1 cm (truncated).

  • strain (int) – the strain number of the mouse
  • mouse (int) – the mouse number in its strain
  • day (int) – the day number

cut_dist – The vector of truncated distances.

Return type:

an array of numbers


>>> getdistance(0, 0, 0)
array([ 1.00648944,  1.02094319,  1.0178885 , ...,  1.00099351,
1.01191156,  1.00423354])
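
The computation described for `getdistance` can be sketched on plain coordinate arrays. The function name, its arguments, and the toy coordinates are hypothetical; the real function loads the coordinates for a given strain, mouse, and day.

```python
import numpy as np


def truncated_step_distances(xs, ys, min_dist=1.0):
    # Euclidean distance between each pair of consecutive points.
    xs = np.asarray(xs, dtype=float)
    ys = np.asarray(ys, dtype=float)
    steps = np.hypot(np.diff(xs), np.diff(ys))
    # Keep only the steps longer than the truncation threshold (1 cm).
    return steps[steps > min_dist]


xs = [0.0, 3.0, 3.0, 3.5]
ys = [0.0, 4.0, 4.5, 4.5]
cut = truncated_step_distances(xs, ys)  # steps are 5.0, 0.5, 0.5
```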

mousestyles.intervals module

Finite Union of Intervals

Implements a class to handle finite (disjoint) unions of intervals.

  • assumes that intervals are always closed and that the union is disjoint
  • open intervals remaining at the end of any operation (e.g. complement) are always made closed, e.g. [0,1]^C = [-np.inf,0] ∪ [1,np.inf]
  • end intervals being unbounded is handled using -np.inf and np.inf
  • does some okay handling for point intervals [a,a]

Darren Rhea, 2012; Chris Hillar revised, April 30, 2013; Ram Mehta revised, 2013; Copyright (c) 2013, All rights reserved; Chris Hillar revised, 2015

class mousestyles.intervals.Intervals(intervals=None)[source]

Bases: object

Finite Union of Intervals [ai,bi] backed by sorted lists.

intervals: (M x 2) numpy np.double array

returns new object of Active States given self as Events


returns new object of Inactive States given self as Events


New Intervals object which is the complement of self.


connects consecutive intervals separated by lengths <= eps
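
Connecting intervals separated by short gaps can be sketched on a plain (M x 2) array of sorted, disjoint intervals. The function name `connect_gaps_sketch` is hypothetical and the code does not use the Intervals class itself; it also assumes at least one interval is given.

```python
import numpy as np


def connect_gaps_sketch(intervals, eps):
    # `intervals` is an (M x 2) array of [start, end] rows sorted by start.
    arr = np.asarray(intervals, dtype=float)
    merged = [arr[0].copy()]
    for start, end in arr[1:]:
        if start - merged[-1][1] <= eps:
            # Gap is at most eps: extend the previous interval.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append(np.array([start, end]))
    return np.vstack(merged)


joined = connect_gaps_sketch([[0.0, 1.0], [1.05, 2.0], [3.0, 4.0]], eps=0.1)
# The first two intervals are joined; the third stays separate.
```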


Returns a new object with gaps connected when rule returns True.

rule: Callable that takes parameters start_time and end_time.

Check if x is in the Finite Union of Intervals.

index_of_first_intersection(x, find_nearest=False)[source]

finds interval nearest to given number x and containing x if find_nearest=False: doesn’t require x to be in the interval


New Intervals object which is the intersection of self and Intervals F.

intersect_with_interval(a, b)[source]

returns (not a copy) Intervals object which is the intersection of self and [a, b] (faster than intersect)


returns a new Intervals object with only intervals containing elements of arr (NOTE: arr is assumed sorted)


Removes intervals with lengths <= eps.


New Intervals object which is the union of self and Intervals F.

mousestyles.intervals.binary_from_intervals(intervals, length=None)[source]

From an intervals object produce a binary sequence of size length

mousestyles.intervals.intervals_from_binary(bin_array, times)[source]

Given a one-dimensional bin_array of 0s and 1s, returns an Intervals object of times corresponding to consecutive 1s
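
One way to locate runs of consecutive 1s is to difference a zero-padded copy of the array. The sketch below returns a plain (M x 2) array of start and end times rather than an Intervals object, and the name `runs_of_ones` and the choice of closing each interval at the time of its last 1 are assumptions.

```python
import numpy as np


def runs_of_ones(bin_array, times):
    b = np.asarray(bin_array, dtype=int)
    t = np.asarray(times, dtype=float)
    # Pad with zeros so runs touching either end are detected.
    change = np.diff(np.concatenate([[0], b, [0]]))
    starts = np.flatnonzero(change == 1)      # index of first 1 in each run
    ends = np.flatnonzero(change == -1) - 1   # index of last 1 in each run
    return np.column_stack([t[starts], t[ends]])


runs = runs_of_ones([0, 1, 1, 0, 1], [0.0, 1.0, 2.0, 3.0, 4.0])
# Two runs: times 1 to 2, and the single sample at time 4.
```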

mousestyles.intervals.timestamps_to_interval(array, eps=0.01)[source]

Given a 1D array of event timestamps, returns an interval centered on each timestamp and eps wide. The default of 0.01 is half of the minimum HCM sampling rate.

mousestyles.kde module

mousestyles.kde.kde(x, x_grid, symmetric_correction=False, cutoff=1)[source]

Return a numpy.ndarray object of estimated density

  • x (numpy.ndarray) – data, as realizations of variable X
  • x_grid (numpy.ndarray) – the grid points for the estimated density
  • symmetric_correction (boolean) – a method indicator. If False, do ordinary Gaussian kernel density estimation (kde). If True, do ordinary Gaussian kde on the data formed by concatenating x with its reflection around the cutoff point, then scale the estimated kde back by a factor of 2. Used e.g. for kernel density estimation of nonnegative data
  • cutoff (float) – the axis of symmetry for symmetric correction

pdf – estimated density at the specified grid points x_grid

Return type:



>>> kde(x = np.array([2,3,1,0]), x_grid=np.linspace(0, 5, 10))
array([ 0.17483395,  0.21599529,  0.23685855,  0.24007961,  0.22670763,
    0.19365019,  0.14228937,  0.08552725,  0.04043597,  0.01463953])
>>> from scipy.stats import norm
>>> x1 = np.concatenate([norm(-1, 1.).rvs(400), norm(1, 0.3).rvs(100)])
>>> pdf1 = kde(x=x1, x_grid=np.linspace(0, 5, 100),
...            symmetric_correction=True, cutoff=1)
>>> pdf1
array([ 0.26625297,  0.26818492,  0.27105849,  0.27489486,  0.27968752, ...
    0.07764054,  0.07239964,  0.06736559,  0.06254175,  0.05793043])
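
The symmetric correction can be sketched with `scipy.stats.gaussian_kde`. This is an illustration under assumptions, not the package's implementation: the name `reflected_kde` is hypothetical, and "a factor of 2" is read here as multiplying the density of the mirrored data by 2 on the side of the cutoff where the data lie.

```python
import numpy as np
from scipy.stats import gaussian_kde


def reflected_kde(x, x_grid, cutoff=0.0):
    x = np.asarray(x, dtype=float)
    # Mirror the data around the cutoff so the estimate is symmetric there.
    mirrored = np.concatenate([x, 2.0 * cutoff - x])
    density = gaussian_kde(mirrored)(x_grid)
    # Half of the mirrored mass lies on each side, so rescale by 2.
    return 2.0 * density


rng = np.random.default_rng(0)
data = rng.exponential(1.0, size=500)  # nonnegative data, cutoff at 0
grid = np.linspace(0.0, 20.0, 4001)
dens = reflected_kde(data, grid, cutoff=0.0)
# Trapezoid rule over the grid; the corrected density integrates to about 1.
area = np.sum((dens[1:] + dens[:-1]) / 2.0 * np.diff(grid))
```

Without the reflection, a Gaussian kernel would place part of its mass below the cutoff, so the estimate on the grid would integrate to less than 1.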

mousestyles.mww module

mousestyles.mww.MWW_allmice(step=50, verbose=False)[source]

Aggregates MWW_mice data for all available strains of mice.

  • step – Time interval length used to compute distances. Default is 1s. See data.distances_bymouse for more information.
  • verbose (boolean) –

mww_values – MWW_mice outputs for each strain. mww_values[i] corresponds to the ith strain.

>>> mww_values = MWW_allmice()

mousestyles.mww.MWW_mice(strain, step=50, verbose=False)[source]

Compare distributions of distances among mice of the same strain. Use p-values of the Mann-Whitney U test.

  • strain (integer) – Number of the strain.
  • step (float) – Time interval length used to compute distances. Default is 1s. See data.distances_bymouse for more information.
  • verbose (boolean) –

cor – p-values of the Mann-Whitney U test for each pair of distance samples among mice of the corresponding strain.

>>> cor = MWW_mice(0)

mousestyles.mww.MWW_strains(step=50, verbose=False)[source]

Compare distributions of distances among strains. Proceed as if the mice in each strain are i.i.d. samples, and compare the p-values of the Mann-Whitney U test.

  • step – Time interval length used to compute distances. Default is 1s. See data.distances_bymouse for more information.
  • verbose (boolean) –

cor – p-values of the Mann-Whitney U test for each pair of distance samples among strains of mice.

>>> cor = MWW_strains()


This function takes a collection of sampled distributions and computes the p-value of the two-sided Mann-Whitney U test for each pair of samples.

The Mann-Whitney U test is a test for assessing whether two independent samples come from the same distribution. The null hypothesis for this test is that the two groups have the same distribution, while the alternative hypothesis is that one group has larger (or smaller) values than the other.

Null hypothesis \(H_0\): \(P(X>Y)=P(Y>X)\). Alternative \(H_1\): not \(H_0\).

The Mann-Whitney U test is similar to the Wilcoxon test, but can be used to compare multiple samples that aren’t necessarily paired.

Parameters: m (list of numpy arrays) – Sampled distributions.
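
The pairwise comparison described above could look like the following sketch built on `scipy.stats.mannwhitneyu`. The function name `pairwise_mww` and the choice to fill the diagonal with 1 are assumptions, not the package's implementation.

```python
import numpy as np
from scipy.stats import mannwhitneyu


def pairwise_mww(samples):
    # Symmetric matrix of two-sided Mann-Whitney U p-values.
    k = len(samples)
    pvals = np.ones((k, k))  # a sample compared with itself gets 1
    for i in range(k):
        for j in range(i + 1, k):
            result = mannwhitneyu(samples[i], samples[j],
                                  alternative="two-sided")
            pvals[i, j] = pvals[j, i] = result.pvalue
    return pvals


rng = np.random.default_rng(0)
samples = [rng.normal(0, 1, 200), rng.normal(0, 1, 200),
           rng.normal(3, 1, 200)]
pvals = pairwise_mww(samples)
# The shifted third sample yields very small p-values against the others.
```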

Plot the p-values output by the Mann-Whitney U test using a correlation matrix representation.

Parameters: data (MWW_allmice output) –

plot – correlation matrix

>>> strains = MWW_strains()
>>> plot_cor(strains)

Vectorized version of plot_cor. Plot several correlation matrices side by side using plot_cor.

Parameters: data (MWW_allmice output) –

plot – correlation matrix

>>> allmice = MWW_allmice()
>>> plot_cor_multi(allmice)

Module contents

mousestyles is the final project for UC Berkeley’s 2016 Masters Capstone Project class. It is based on ideas, code, and data from the Tecott Lab at UCSF.