mousestyles package¶
Subpackages¶
- mousestyles.behavior package
- mousestyles.classification package
- mousestyles.data package
- mousestyles.distributions package
- mousestyles.dynamics package
- mousestyles.path_diversity package
- Submodules
- mousestyles.path_diversity.clean_movements module
- mousestyles.path_diversity.detect_noise module
- mousestyles.path_diversity.filter_path module
- mousestyles.path_diversity.get_dist_speed module
- mousestyles.path_diversity.path_features module
- mousestyles.path_diversity.path_features_advanced module
- mousestyles.path_diversity.path_index module
- mousestyles.path_diversity.smooth_noise module
- Module contents
- mousestyles.ultradian package
- mousestyles.visualization package
- Submodules
- mousestyles.visualization.distribution_plot module
- mousestyles.visualization.dynamics module
- mousestyles.visualization.path_diversity_plotting module
- mousestyles.visualization.plot_classification module
- mousestyles.visualization.plot_clustering module
- mousestyles.visualization.plot_lomb_scargle module
- mousestyles.visualization.plot_path module
- mousestyles.visualization.plot_ultradian module
- Module contents
Submodules¶
mousestyles.GLRT_distribution module¶
-
mousestyles.GLRT_distribution.
hypo_exp_null
(strain, mouse, day, law_est=0, exp_est=0, seed=-1)[source]¶ Return the outcome from GLRT with the exponential distribution as the null hypothesis.
This function also uses the Generalized Likelihood Ratio Test (GLRT) to test goodness of fit, in other words, which distribution is more likely.
In this function, the exponential distribution is the null and the power law distribution is the alternative. We derived the test statistic from theory and plugged in the MLE as our estimate of the best parameters.
After calculating the parameters, we need to find the rejection region, critical value, or p-value. To get a more general test, we use the p-value instead of a critical value at a fixed significance level.
To find the p-value, we use simulation, and all random numbers are drawn with previously defined functions. Therefore, although the p-value should be constant for given data, the value returned by this function changes between calls unless the seed is set.
In general, if the p-value is too small, we reject the null and conclude that the exponential distribution is not a better fit than the power law distribution.
Parameters: - strain (int) – the strain number of the mouse
- mouse (int) – the mouse number in its strain
- day (int) – the day number
- law_est (double (optional)) – the estimated parameter in law distribution
- exp_est (double (optional)) – the estimated parameter in exponential distribution
Returns: the p-value used to decide whether to reject the null.
Return type: p_value
Examples
>>> hypo_exp_null(0, 0, 0)
1.0
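The simulation-based p-value described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's implementation: it assumes data truncated at 1, the shifted exponential density l * exp(-l * (x - 1)), the power law density (a - 1) * x^(-a), and closed-form MLEs for both. All function names (`mle_exp`, `mle_powerlaw`, `log_lr`, `glrt_pvalue_exp_null`) are hypothetical.

```python
import math
import random


def mle_exp(xs):
    # MLE of lambda for the assumed density l * exp(-l * (x - 1)) on x >= 1.
    return len(xs) / sum(x - 1.0 for x in xs)


def mle_powerlaw(xs):
    # MLE of alpha for the density (a - 1) * x ** (-a) on x >= 1.
    return 1.0 + len(xs) / sum(math.log(x) for x in xs)


def log_lr(xs):
    # Log likelihood ratio: power law (alternative) minus exponential (null),
    # each evaluated at its own MLE.
    l, a = mle_exp(xs), mle_powerlaw(xs)
    ll_exp = sum(math.log(l) - l * (x - 1.0) for x in xs)
    ll_law = sum(math.log(a - 1.0) - a * math.log(x) for x in xs)
    return ll_law - ll_exp


def glrt_pvalue_exp_null(xs, n_sim=200, seed=None):
    # Simulate the statistic under the fitted exponential null and report the
    # fraction of simulated values at least as large as the observed one.
    rng = random.Random(seed)
    observed = log_lr(xs)
    l_hat = mle_exp(xs)
    hits = 0
    for _ in range(n_sim):
        sim = [1.0 + rng.expovariate(l_hat) for _ in xs]
        if log_lr(sim) >= observed:
            hits += 1
    return hits / n_sim
```

With a fixed `seed` the returned value is reproducible; with `seed=None` it varies from call to call, which mirrors the remark above about the p-value not being constant when no seed is set.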
-
mousestyles.GLRT_distribution.
hypo_powerLaw_null
(strain, mouse, day, law_est=0, seed=-1)[source]¶ Return the outcome from GLRT with the power law distribution as the null hypothesis.
This function uses the Generalized Likelihood Ratio Test (GLRT) to test goodness of fit, in other words, which distribution is more likely.
In this function, the power law distribution is the null and the exponential distribution is the alternative. We derived the test statistic from theory and plugged in the MLE as our estimate of the best parameters.
After calculating the parameters, we need to find the rejection region, critical value, or p-value. To get a more general test, we use the p-value instead of a critical value at a fixed significance level.
To find the p-value, we use simulation, and all random numbers are drawn with previously defined functions. Therefore, although the p-value should be constant for given data, the value returned by this function changes between calls unless the seed is set.
In general, if the p-value is too small, we reject the null and conclude that the power law distribution is not a better fit than the exponential distribution.
Parameters: - strain (int) – the strain number of the mouse
- mouse (int) – the mouse number in its strain
- day (int) – the day number
- law_est (double (optional)) – the estimated parameter in law distribution
Returns: the p-value used to decide whether to reject the null.
Return type: p_value
Examples
>>> hypo_powerLaw_null(0, 0, 0)
0.0070000000000000001
-
mousestyles.GLRT_distribution.
random_exp
(n, l, seed=-1)[source]¶ Randomly generate points from a truncated exponential distribution.
The generation method uses the memorylessness property of the exponential distribution. Since the survival function of the exponential distribution keeps the same form after conditioning on exceeding a threshold, sampling from the truncated exponential distribution is equivalent to sampling from the regular exponential distribution and shifting by the truncation value.
Parameters: - n (int) – number of points
- l (int) – exponential parameter lambda
Returns: points – n points drawn from the target distribution.
Return type: a vector of floats
Examples
>>> random_exp(4,2)
array([ 1.07592496, 1.19789646, 1.19759663, 1.03993227])
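The shift-based sampling described above can be illustrated with the standard library. This is a sketch, not the module's code: it assumes truncation at 1 (consistent with the example output, where every point exceeds 1), and the name `random_exp_sketch` is hypothetical.

```python
import random


def random_exp_sketch(n, l, seed=None):
    # Draw from a regular exponential with rate l and shift by the
    # truncation point 1, so every sample is >= 1.
    rng = random.Random(seed)
    return [1.0 + rng.expovariate(l) for _ in range(n)]


points = random_exp_sketch(4, 2, seed=0)
```

Under these assumptions the sample mean approaches 1 + 1/l as n grows.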
-
mousestyles.GLRT_distribution.
random_powerlaw
(n, a, seed=-1)[source]¶ Randomly generate points from a truncated power law.
The method inverts the cumulative distribution function (CDF) of the truncated power law and evaluates the inverse at random numbers drawn from Unif[0,1]. The theory behind it is that \(F^{-1}(U) \sim F\).
Parameters: - n (int) – number of points
- a (int>1) – power law parameter alpha
Returns: points – n points drawn from the target distribution.
Return type: a vector of floats
Examples
>>> random_powerlaw(4,2)
array([ 1.18097435, 1.04584078, 1.4650779 , 36.03967524])
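Inverse-CDF sampling for the power law can be sketched as below. The sketch assumes the CDF F(x) = 1 - x^(1 - a) on x >= 1, which matches the inverse formula documented for `powerlaw_inverse_cdf`; the name `random_powerlaw_sketch` is hypothetical.

```python
import random


def random_powerlaw_sketch(n, a, seed=None):
    # F(x) = 1 - x ** (1 - a) for x >= 1, so F^{-1}(u) = (1 - u) ** (1 / (1 - a)).
    rng = random.Random(seed)
    return [(1.0 - rng.random()) ** (1.0 / (1.0 - a)) for _ in range(n)]


samples = random_powerlaw_sketch(4, 2, seed=0)
```

For a = 2 the theoretical median is F^{-1}(0.5) = 2, which gives a quick sanity check on a large sample.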
mousestyles.distribution module¶
-
mousestyles.distribution.
exp_inverse_cdf
(y, l)[source]¶ The inverse CDF function of the exponential distribution truncated at 1
Parameters: - y (float in [0, 1], or a np.ndarray) – y in formula F^(-1)(y) = 1 - log(1 - y) / l
- l (float > 0) – l in formula F^(-1)(y) = 1 - log(1 - y) / l
Returns: x – The inverse CDF of the exponential distribution truncated at 1, with parameter l, at point y
Return type: float
Examples
>>> exp_inverse_cdf(0.6,2)
1.4581453659370776
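The formula above can be checked with a short round trip. This sketch uses scalar inputs only and hypothetical names; the CDF F(x) = 1 - exp(-l * (x - 1)) is derived here by inverting the documented formula and is not stated on this page.

```python
import math


def exp_inverse_cdf_sketch(y, l):
    # F^(-1)(y) = 1 - log(1 - y) / l for the exponential truncated at 1.
    return 1.0 - math.log(1.0 - y) / l


def exp_cdf_sketch(x, l):
    # Matching CDF, F(x) = 1 - exp(-l * (x - 1)) for x >= 1.
    return 1.0 - math.exp(-l * (x - 1.0))


x = exp_inverse_cdf_sketch(0.6, 2)
y_back = exp_cdf_sketch(x, 2)  # recovers 0.6 up to rounding
```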
-
mousestyles.distribution.
exp_pdf
(x, l)[source]¶ The probability density function of truncated exponential.
Parameters: - x (float, or a np.ndarray) – x in formula p(x)=lambda*exp(-lambda*x).
- l (float) – lambda in formula p(x)=lambda*exp(-lambda*x).
Returns: probability density – The probability density of truncated exponential at x.
Return type: float
Examples
>>> exp_pdf(1, 1)
0.36787944117144233
-
mousestyles.distribution.
powerlaw_inverse_cdf
(y, a)[source]¶ The inverse CDF function of power law distribution
Parameters: - y (float in [0, 1], or a np.ndarray) – y in formula F^(-1)(y) = (1 - y)^(1/(1-a))
- a (float > 1) – a in formula F^(-1)(y) = (1 - y)^(1/(1-a))
Returns: x – The inverse CDF function of power law distribution with parameter a at point y
Return type: float
Examples
>>> powerlaw_inverse_cdf(0.5, 5)
1.189207115002721
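A scalar-only sketch of the inverse CDF formula, with a round trip through the matching CDF. The names are hypothetical, and the CDF F(x) = 1 - x^(1 - a) is derived here from the documented inverse rather than taken from the package.

```python
def powerlaw_inverse_cdf_sketch(y, a):
    # F^(-1)(y) = (1 - y) ** (1 / (1 - a)), valid for a > 1 and 0 <= y < 1.
    return (1.0 - y) ** (1.0 / (1.0 - a))


def powerlaw_cdf_sketch(x, a):
    # Matching CDF, F(x) = 1 - x ** (1 - a) for x >= 1.
    return 1.0 - x ** (1.0 - a)


x = powerlaw_inverse_cdf_sketch(0.5, 5)
y_back = powerlaw_cdf_sketch(x, 5)  # recovers 0.5 up to rounding
```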
-
mousestyles.distribution.
powerlaw_pdf
(x, a)[source]¶ The probability density function of truncated power law.
Parameters: - x (float > 0, or a np.ndarray) – x in formula p(x)=(alpha-1)*x^(-alpha).
- a (float > 1) – alpha in formula p(x)=(alpha-1)*x^(-alpha).
Returns: probability density – The probability density of power law at x.
Return type: float
Examples
>>> powerlaw_pdf(2, 2)
0.25
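For alpha > 1, the density (alpha-1)*x^(-alpha) integrates to 1 over [1, infinity), which is why the factor (alpha-1) pairs with truncation at 1. The sketch below checks this numerically over a finite range; `powerlaw_pdf_sketch` and `integrate_midpoint` are hypothetical helper names.

```python
def powerlaw_pdf_sketch(x, a):
    # p(x) = (a - 1) * x ** (-a), the formula given above.
    return (a - 1.0) * x ** (-a)


def integrate_midpoint(f, lo, hi, steps=100000):
    # Plain midpoint rule; enough to illustrate the normalization.
    width = (hi - lo) / steps
    return width * sum(f(lo + (i + 0.5) * width) for i in range(steps))


# For a = 2 the exact integral over [1, 1000] is 1 - 1/1000 = 0.999.
mass = integrate_midpoint(lambda x: powerlaw_pdf_sketch(x, 2.0), 1.0, 1000.0)
```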
mousestyles.est_power_param module¶
-
mousestyles.est_power_param.
fit_dist_all
()[source]¶ Return the estimators of truncated power law and exponential for each mouse day.
Returns: estimator – The estimator of truncated exponential distribution.
Return type: a float number
Examples
>>> fit()
7.385844980814098
-
mousestyles.est_power_param.
fit_exponential
(strain, mouse, day)[source]¶ Return the estimator of truncated exponential.
Parameters: - strain (int) – the strain number of the mouse
- mouse (int) – the mouse number in its strain
- day (int) – the day number
Returns: estimator – The estimator of truncated exponential distribution.
Return type: a float number
Examples
>>> fit_exponential(0, 0, 0)
7.385844980814098
-
mousestyles.est_power_param.
fit_powerlaw
(strain, mouse, day)[source]¶ Return the estimator of truncated power law.
Parameters: - strain (int) – the strain number of the mouse
- mouse (int) – the mouse number in its strain
- day (int) – the day number
Returns: estimator – The estimator of truncated power law.
Return type: a float number
Examples
>>> fit_powerlaw(0, 0, 0)
9.4748705008269827
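This page does not state which estimator fit_powerlaw uses. Assuming it is the maximum likelihood estimator (the GLRT functions above mention plugging in the MLE), the estimate for the density \(p(x) = (\alpha - 1)x^{-\alpha}\) on \(x \ge 1\) follows from the log-likelihood of a sample \(x_1, \dots, x_n\):

\[
\ell(\alpha) = n \log(\alpha - 1) - \alpha \sum_{i=1}^{n} \log x_i,
\qquad
\frac{d\ell}{d\alpha} = \frac{n}{\alpha - 1} - \sum_{i=1}^{n} \log x_i = 0
\;\Longrightarrow\;
\hat{\alpha} = 1 + \frac{n}{\sum_{i=1}^{n} \log x_i}.
\]

Under the same assumption, with the exponential truncated at 1 so that \(x - 1\) is exponential with rate \(\lambda\), the analogous estimate is \(\hat{\lambda} = n / \sum_{i=1}^{n} (x_i - 1)\).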
-
mousestyles.est_power_param.
getdistance
(strain, mouse, day)[source]¶ Return the distances between each pair of consecutive points among the coordinates, keeping only distances greater than 1 cm (truncated).
Parameters: - strain (int) – the strain number of the mouse
- mouse (int) – the mouse number in its strain
- day (int) – the day number
Returns: cut_dist – The vector of truncated distances.
Return type: an array of numbers
Examples
>>> getdistance(0, 0, 0)
array([ 1.00648944, 1.02094319, 1.0178885 , ..., 1.00099351, 1.01191156, 1.00423354])
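The truncation step can be illustrated on made-up coordinates. This sketch assumes Euclidean distance between consecutive (x, y) points in cm; how the package loads coordinates for a given strain, mouse, and day is not shown here, and `consecutive_distances` is a hypothetical name.

```python
import math


def consecutive_distances(xs, ys, cutoff=1.0):
    # Euclidean distance between each pair of consecutive (x, y) points,
    # keeping only distances strictly greater than the cutoff (1 cm here).
    dists = [
        math.hypot(x1 - x0, y1 - y0)
        for x0, y0, x1, y1 in zip(xs, ys, xs[1:], ys[1:])
    ]
    return [d for d in dists if d > cutoff]


# Hypothetical coordinates in cm: steps of length 5, 0.5 and 2.
kept = consecutive_distances([0, 3, 3, 3], [0, 4, 4.5, 6.5])
print(kept)  # [5.0, 2.0]
```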
mousestyles.intervals module¶
Finite Union of Intervals
Implements a class to handle finite (disjoint) unions of intervals.
- assumes that intervals are always closed and that the union is disjoint
- open intervals remaining at the end of any operation (e.g. complement) are always made closed, e.g. [0,1]^C = [-np.inf,0] [1,np.inf]
- end intervals being unbounded is handled using -np.inf and np.inf
- does some okay handling for point intervals [a,a]
Darren Rhea, 2012; Chris Hillar revised, April 30, 2013; Ram Mehta revised, 2013; Copyright (c) 2013, All rights reserved; Chris Hillar revised, 2015
-
class
mousestyles.intervals.
Intervals
(intervals=None)[source]¶ Bases:
object
Finite Union of Intervals [ai,bi] backed by sorted lists.
- parameters
- intervals: (M x 2) numpy np.double array
-
connect_gaps_by_rule
(rule)[source]¶ Returns a new object with gaps connected when rule returns True.
- Parameters
- rule: Callable that takes parameters start_time and end_time.
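The gap-connecting idea can be sketched on a plain sorted list of (start, end) pairs rather than the Intervals class. It assumes that rule receives the start and end of each gap between neighbouring intervals, which this page does not confirm; `connect_gaps` is a hypothetical name.

```python
def connect_gaps(intervals, rule):
    # intervals: sorted, disjoint list of (start, end) pairs.
    # rule(gap_start, gap_end) -> True when the gap should be closed.
    merged = []
    for start, end in intervals:
        if merged and rule(merged[-1][1], start):
            merged[-1] = (merged[-1][0], end)  # absorb the gap
        else:
            merged.append((start, end))
    return merged


# Close every gap shorter than 1 time unit.
result = connect_gaps([(0, 2), (2.5, 4), (6, 7)], lambda s, e: e - s < 1)
print(result)  # [(0, 4), (6, 7)]
```

Like the documented method, the sketch builds a new list and leaves its input untouched.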
-
index_of_first_intersection
(x, find_nearest=False)[source]¶ finds interval nearest to given number x and containing x if find_nearest=False: doesn’t require x to be in the interval
-
intersect_with_interval
(a, b)[source]¶ returns (not a copy) Intervals object which is the intersection of self and [a, b] (faster than intersect)
-
mousestyles.intervals.
binary_from_intervals
(intervals, length=None)[source]¶ From an intervals object produce a binary sequence of size length
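One way to picture the binary sequence is shown below. This sketch assumes closed intervals with integer endpoints that index positions in the output; the package may map interval endpoints to positions differently, and `binary_from_pairs` is a hypothetical name.

```python
def binary_from_pairs(pairs, length=None):
    # pairs: disjoint closed intervals [a, b] with integer endpoints (assumed).
    if length is None:
        length = max((b for _, b in pairs), default=-1) + 1
    bits = [0] * length
    for a, b in pairs:
        for i in range(a, min(b, length - 1) + 1):
            bits[i] = 1
    return bits


bits = binary_from_pairs([(1, 2), (5, 5)], length=8)
print(bits)  # [0, 1, 1, 0, 0, 1, 0, 0]
```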
mousestyles.kde module¶
-
mousestyles.kde.
kde
(x, x_grid, symmetric_correction=False, cutoff=1)[source]¶ Return a numpy.ndarray object of estimated density
Parameters: - x (numpy.ndarray) – data, as realizations of variable X
- x_grid (numpy.ndarray) – the grid points for the estimated density
- symmetric_correction (boolean) – a method indicator. If False, do common gaussian kernel density estimation (kde). If True, do common gaussian kde on data generated from x concatenating with its reflection around the cutoff point. Then transform the estimated kde back by a factor of 2. Used for e.g. kde for nonnegative kernel estimation
- cutoff (float) – the axis of symmetry for symmetric correction
Returns: pdf – estimated density at the specified grid points x_grid
Return type: numpy.ndarray
Examples
>>> kde(x = np.array([2,3,1,0]), x_grid=np.linspace(0, 5, 10))
array([ 0.17483395, 0.21599529, 0.23685855, 0.24007961, 0.22670763, 0.19365019, 0.14228937, 0.08552725, 0.04043597, 0.01463953])
>>> x1 = np.concatenate([norm(-1, 1.).rvs(400), norm(1, 0.3).rvs(100)])
>>> pdf1 = kde(x=x1, x_grid=np.linspace(0, 5, 100), symmetric_correction=True, cutoff=1)
array([ 0.26625297, 0.26818492, 0.27105849, 0.27489486, 0.27968752, ... 0.07764054, 0.07239964, 0.06736559, 0.06254175, 0.05793043])
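The symmetric correction described for symmetric_correction=True can be sketched with the standard library. This is an illustration, not the package's code: the bandwidth uses Scott's rule as an assumption (the page does not say which bandwidth kde uses, so values need not match the examples above), and both function names are hypothetical.

```python
import math
import statistics


def gaussian_kde_sketch(data, grid):
    # Gaussian kernel density estimate with Scott's rule bandwidth (an assumption).
    n = len(data)
    h = statistics.stdev(data) * n ** (-1 / 5)
    norm = 1.0 / (n * h * math.sqrt(2 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((g - x) / h) ** 2) for x in data)
        for g in grid
    ]


def reflected_kde_sketch(data, grid, cutoff):
    # Concatenate the data with its reflection around the cutoff, estimate the
    # density of the combined sample, then scale by 2.
    mirrored = [2 * cutoff - x for x in data]
    return [2 * d for d in gaussian_kde_sketch(list(data) + mirrored, grid)]
```

Because the combined sample is symmetric about the cutoff, half of its estimated mass lies on each side, so the factor of 2 makes the corrected density integrate to 1 over the side of the cutoff that holds the data.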
mousestyles.mww module¶
-
mousestyles.mww.
MWW_allmice
(step=50, verbose=False)[source]¶ Aggregates MWW_mice data for all available strains of mice.
Parameters: - step – Time interval length used to compute distances. Default is 1s. See data.distances_bymouse for more information.
- verbose (boolean) –
Returns: mww_values – MWW_mice outputs for each strain. mww_values[i] corresponds to the ith strain.
Examples
>>> mww_values = MWW_allmice()
-
mousestyles.mww.
MWW_mice
(strain, step=50, verbose=False)[source]¶ Compare distributions of distances among mice of the same strain. Use p-values of the Mann-Whitney U test.
Parameters: - strain (integer) – Number of the strain.
- step (float) – Time interval length used to compute distances. Default is 1s. See data.distances_bymouse for more information.
- verbose (boolean) –
Returns: cor – p-values of the Mann-Whitney U test for each pair of distance samples among mice of the corresponding strain.
Examples
>>> cor = MWW_mice(0)
-
mousestyles.mww.
MWW_strains
(step=50, verbose=False)[source]¶ Compare distributions of distances among strains. Proceed as if the mice in each strain are i.i.d. samples, and compare the p-values of the Mann-Whitney U test.
Parameters: - step – Time interval length used to compute distances. Default is 1s. See data.distances_bymouse for more information.
- verbose (boolean) –
Returns: cor – p-values of the Mann-Whitney U test for each pair of distance samples among strains of mice.
Examples
>>> cor = MWW_strains()
-
mousestyles.mww.
get_pvalues
(m)[source]¶ This function takes a set of sampled distributions and computes the p-values of the two-sided Mann-Whitney U test for each pair of samples.
The Mann-Whitney U test is a test for assessing whether two independent samples come from the same distribution. The null hypothesis for this test is that the two groups have the same distribution, while the alternative hypothesis is that one group has larger (or smaller) values than the other.
Null hypothesis \(H_0\): \(P(X>Y)=P(Y>X)\). Alternative \(H_1\): not \(H_0\).
The Mann-Whitney U test is similar to the Wilcoxon test, but can be used to compare multiple samples that aren’t necessarily paired.
Parameters: m (list of numpy arrays) – Sampled distributions.
Returns: cor – 2 dimensional array of p-values. cor[i,j] is the p-value of the MWW test between the samples i and j.
Notes
A p-value < 0.05 means that there is strong evidence to reject the null hypothesis.
References
- Mann-Whitney U test: http://tqmp.org/RegularArticles/vol04-1/p013/p013.pdf
- Non parametric tests: http://www.mit.edu/~6.s085/notes/lecture5.pdf
Examples
>>> cor = get_pvalues([np.array([1, 2, 3]), np.array([1, 1, 2])])
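The pairwise p-value matrix can be sketched without external libraries. This illustration uses the normal approximation to the U statistic with midranks and a tie-corrected variance, and no continuity correction; the package may compute p-values differently (for example through a statistics library), so numbers need not match. It assumes each pair has at least two observations in total. The names `mann_whitney_p` and `pairwise_pvalues` are hypothetical.

```python
import math


def mann_whitney_p(x, y):
    # Two-sided p-value from the normal approximation with midranks for ties.
    n1, n2 = len(x), len(y)
    pooled = sorted((v, g) for g, sample in enumerate((x, y)) for v in sample)
    n = n1 + n2
    rank_sum_x, tie_term, i = 0.0, 0.0, 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2.0  # average of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += midrank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean = n1 * n2 / 2.0
    var = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var == 0:
        return 1.0  # all pooled values identical
    z = (u - mean) / math.sqrt(var)
    return math.erfc(abs(z) / math.sqrt(2))


def pairwise_pvalues(samples):
    # cor[i][j] is the p-value between samples i and j.
    k = len(samples)
    return [[mann_whitney_p(samples[i], samples[j]) for j in range(k)]
            for i in range(k)]


cor = pairwise_pvalues([[1, 2, 3], [1, 1, 2]])
```

The resulting matrix is symmetric with ones on the diagonal, since a sample compared with itself gives a U statistic equal to its expected value.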
Module contents¶
mousestyles is the final project for UC Berkeley’s 2016 Masters Capstone Project class. It is based on ideas, code, and data from the Tecott Lab at UCSF.