mousestyles.data package

Submodules

mousestyles.data.utils module

Data utilities.

mousestyles.data.utils.day_to_mouse_average(features, labels, num_strains=16, stdev=False, stderr=False)[source]

The first three columns of labels are the strain number, mouse number, and day number. features is an M x N matrix of mouse days x features.

Returns: new data matrix with a mean and stdev/stderr for each mouse over mouse days
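
The per-mouse averaging described above can be sketched as follows. This is a minimal illustration, not the mousestyles implementation: the helper name `mouse_average_sketch` is hypothetical, and the use of the population standard deviation is an assumption.

```python
import numpy as np

def mouse_average_sketch(features, labels):
    """Average feature rows over the mouse days of each (strain, mouse) pair.

    Hypothetical helper, not part of mousestyles.
    labels: (M, 3) array with columns strain, mouse, day.
    features: (M, N) array with one row per mouse day.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    # Unique (strain, mouse) pairs and, for every row, the index of its pair.
    pairs, inverse = np.unique(labels[:, :2], axis=0, return_inverse=True)
    inverse = inverse.ravel()  # the shape of `inverse` varies across NumPy versions
    means = np.zeros((len(pairs), features.shape[1]))
    stdevs = np.zeros_like(means)
    for k in range(len(pairs)):
        rows = features[inverse == k]
        means[k] = rows.mean(axis=0)
        stdevs[k] = rows.std(axis=0)  # population std (ddof=0), an assumption
    return pairs, means, stdevs

labels = np.array([[0, 0, 5], [0, 0, 6], [0, 1, 5], [1, 0, 5]])
features = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0], [7.0, 70.0]])
pairs, means, stdevs = mouse_average_sketch(features, labels)
```

Here mouse (0, 0) has two mouse days, so its row in `means` is the average of the first two feature rows.
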
mousestyles.data.utils.idx_restrict_to_rectangles(TXY, rects=[(0, 0)], xlims=(0, 1), ylims=(0, 1), xbins=2, ybins=4, eps=0.01)[source]

given a (3 x T) array TXY whose 0th row is an array of times [ASSUMED SORTED] and whose rows 1 and 2 are the x and y coordinates

returns a new interval array which is E minus those things occurring outside of the given rectangle

mousestyles.data.utils.map_xbins_ybins_to_cage(rectangle=(0, 0), xbins=2, ybins=4, YLower=1.0, YUpper=43.0, XUpper=3.75, XLower=-16.25)[source]

converts a rectangle in xbins x ybins to corresponding rectangle in Cage coordinates

format is [[p1, p2], [p3, p4]] where pi = (cage_height_location, cage_length_location)

Caution (from an unresolved developer comment in the source): this gives wrong cage locations for top, bottom, left, and right; xbins and ybins do not reflect the cage geometry perfectly.

mousestyles.data.utils.mouse_to_strain_average(features, labels, num_strains=16, stdev=False, stderr=False)[source]

The first two columns of the M x N data matrix are the strain number (1 to num_strains) and the mouse number; the other columns are features.

Returns: new data matrix with a mean and stdev/stderr for each strain over mice
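
As a rough illustration of averaging over mice within a strain, the sketch below follows the single-matrix layout described above. The helper name `strain_average_sketch` is hypothetical, and computing the standard error as the sample standard deviation (ddof=1) divided by the square root of the number of mice is an assumption about the convention.

```python
import numpy as np

def strain_average_sketch(data):
    """Mean and standard error of each feature per strain.

    Hypothetical helper, not part of mousestyles.
    data: (M, 2 + N) array with columns strain, mouse, then N features.
    """
    data = np.asarray(data, dtype=float)
    strains = np.unique(data[:, 0])
    means, stderrs = [], []
    for s in strains:
        rows = data[data[:, 0] == s, 2:]
        n = rows.shape[0]
        means.append(rows.mean(axis=0))
        if n > 1:
            # Standard error of the mean: sample std over sqrt(n) (assumed convention).
            stderrs.append(rows.std(axis=0, ddof=1) / np.sqrt(n))
        else:
            # A single mouse gives no spread estimate; report zero here.
            stderrs.append(np.zeros(rows.shape[1]))
    return strains, np.array(means), np.array(stderrs)

data = np.array([[1, 0, 2.0], [1, 1, 4.0], [2, 0, 10.0]])
strains, means, stderrs = strain_average_sketch(data)
```
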

mousestyles.data.utils.pull_locom_tseries_subset(M, start_time=0, stop_time=300)[source]

given an (m x n) numpy array M where the 0th row is array of times [ASSUMED SORTED]

returns a new array (copy) that is the subset of M corresponding to start_time, stop_time

returns [] if times are not in array

(the difficulty is that if the mouse does not move, nothing gets registered,
so we should artificially create start_time, stop_time movement events at the boundaries)
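
Because the times are assumed sorted, the window can be located with a binary search. The sketch below is only an illustration: the name `time_window_sketch` is hypothetical, the closed interval [start_time, stop_time] is an assumption, and it does not create the artificial boundary events mentioned above.

```python
import numpy as np

def time_window_sketch(M, start_time=0, stop_time=300):
    """Copy the columns of M whose time (row 0) lies in [start_time, stop_time].

    Hypothetical helper; assumes row 0 of M is sorted in increasing order.
    """
    M = np.asarray(M, dtype=float)
    times = M[0]
    lo = np.searchsorted(times, start_time, side="left")   # first time >= start_time
    hi = np.searchsorted(times, stop_time, side="right")   # one past last time <= stop_time
    return M[:, lo:hi].copy()

M = np.array([[0.0, 10.0, 20.0, 400.0],
              [1.0, 2.0, 3.0, 4.0]])
subset = time_window_sketch(M, start_time=5, stop_time=300)
empty = time_window_sketch(M, start_time=500, stop_time=600)
```

When no time falls inside the window, this sketch returns an array with zero columns rather than a plain empty list.
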
mousestyles.data.utils.split_data_in_half_randomly(features, labels)[source]
given an array of the form:
features = M x A x B x C x ...

where M is the number of mouse days

and an array labels for this data of the form:
labels = M x 2

where labels[:, 0] are strain numbers and labels[:, 1] are mouse numbers

returns

  • bootstrap_data_1: a random half of the mouse days
  • bootstrap_labels_1
  • bootstrap_data_2: the other half
  • bootstrap_labels_2
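
One way to picture the split is a random permutation of the mouse-day indices cut in the middle. This is a sketch under assumptions, not the library code: the name `split_half_sketch` and its `seed` argument are hypothetical, and the real function may choose its halves differently.

```python
import numpy as np

def split_half_sketch(features, labels, seed=0):
    """Split mouse days (axis 0) into two random halves.

    Hypothetical helper; `seed` is an illustrative argument only.
    """
    features = np.asarray(features)
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    order = rng.permutation(features.shape[0])  # shuffled mouse-day indices
    half = features.shape[0] // 2
    first, second = order[:half], order[half:]
    return features[first], labels[first], features[second], labels[second]

features = np.arange(12).reshape(6, 2)
labels = np.array([[0, 0], [0, 0], [0, 1], [0, 1], [1, 0], [1, 0]])
data_1, labels_1, data_2, labels_2 = split_half_sketch(features, labels)
```

Indexing features and labels with the same index arrays keeps each mouse day paired with its label.
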
mousestyles.data.utils.total_time_rectangle_bins(M, xlims=(0, 1), ylims=(0, 1), xbins=5, ybins=10)[source]

given a (3 x n) numpy array M where the 0th row is an array of times [ASSUMED SORTED] and rows 1 and 2 are the x and y coordinates

returns a new (xbins x ybins) array (copy) that contains PDF of location over time
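
A time-weighted two-dimensional histogram is one plausible reading of "PDF of location over time". The sketch below assumes each position is held until the next time stamp; that assumption and the name `occupancy_pdf_sketch` are not taken from mousestyles.

```python
import numpy as np

def occupancy_pdf_sketch(M, xlims=(0, 1), ylims=(0, 1), xbins=5, ybins=10):
    """Fraction of time spent in each cell of an xbins x ybins grid.

    Hypothetical helper. M is a (3, n) array with rows t, x, y, times sorted.
    """
    t, x, y = np.asarray(M, dtype=float)
    # Assumption: a position is held until the next sample, so the last
    # sample contributes no dwell time.
    dwell = np.diff(t)
    hist, _, _ = np.histogram2d(
        x[:-1], y[:-1],
        bins=[xbins, ybins],
        range=[xlims, ylims],
        weights=dwell,
    )
    total = hist.sum()
    return hist / total if total > 0 else hist

M = np.array([[0.0, 1.0, 4.0, 5.0],
              [0.1, 0.1, 0.9, 0.9],
              [0.1, 0.1, 0.9, 0.9]])
pdf = occupancy_pdf_sketch(M, xbins=2, ybins=2)
```

In this toy input the mouse spends 4 of 5 time units in the first cell and 1 in the opposite corner.
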

Module contents

mousestyles.data.distances(strain, mouse, day, step=50)[source]

Return a numpy array object of project movement data for the specified combination of strain, mouse and day.

At regular timesteps, defined by the step parameter, compute the Euclidean distance between the positions of the mouse at two consecutive times.

More specifically:

  • let \(\delta_t\) be the step parameter.
  • let \(t_n\) be the sequence of non-negative numbers such that \(t_0 = 0\) and \(t_{n+1} = t_n + \delta_t\). The sequence is defined for all \(n\) such that \(n \ge 0\) and \(t_n\) does not exceed the time of the experiment.
  • let \(d_n\) be the sequence of non-negative numbers such that \(d_0 = 0\) and, for \(n \ge 1\), \(d_n\) equals the distance between the positions of the mouse on the given day at times \(t_{n-1}\) and \(t_n\). \(d_n\) is then defined on the same set of integers as the sequence \(t_n\).
  • The function returns the sequence \(d_n\).
Parameters:
  • strain (int) – nonnegative integer indicating the strain number
  • mouse (int) – nonnegative integer indicating the mouse number
  • day (int) – nonnegative integer indicating the day number
  • step (float) – positive float defining the time between two observations; the default corresponds to 1 second
Returns:

movement

Return type:

numpy array

Examples

>>> dist = distances(0, 0, 0, step=1e2)
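
The timestep construction described for distances can be sketched on synthetic data as follows. This is an illustration under assumptions: the name `step_distances_sketch` is hypothetical, positions on the regular grid are obtained by linear interpolation (an assumption), and step is taken to be in the same units as the time stamps.

```python
import numpy as np

def step_distances_sketch(t, x, y, step=50):
    """Distances between positions at consecutive grid times t_n.

    Hypothetical helper; t must be increasing. Returns d with d[0] = 0.
    """
    t = np.asarray(t, dtype=float)
    elapsed = t - t[0]  # measure time from the first sample, so t_0 = 0
    # Grid t_0 = 0, t_{n+1} = t_n + step, up to the end of the recording.
    grid = np.arange(0.0, elapsed[-1] + 1e-9, step)
    # Assumption: positions between samples are linearly interpolated.
    gx = np.interp(grid, elapsed, x)
    gy = np.interp(grid, elapsed, y)
    d = np.zeros(len(grid))
    d[1:] = np.hypot(np.diff(gx), np.diff(gy))  # Euclidean step lengths
    return d

t = [100.0, 101.0, 102.0, 103.0, 104.0]
x = [0.0, 3.0, 6.0, 6.0, 6.0]
y = [0.0, 4.0, 8.0, 8.0, 8.0]
d = step_distances_sketch(t, x, y, step=2.0)
```

With a step of 2, the grid times are 0, 2 and 4, and the mouse moves only during the first step.
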
mousestyles.data.distances_bymouse(strain, mouse, step=50, verbose=False)[source]

Aggregates ‘distances’ for all days of recorded data for one particular mouse.

More specifically:

  • let \(d^1,...,d^D\) be the sequence of distances for one particular mouse for days \(1\) to \(D\).
  • The function returns the concatenation of the \(d^i\).
Parameters:
  • strain (int) – nonnegative integer indicating the strain number
  • mouse (int) – nonnegative integer indicating the mouse number
  • step (float) – positive float defining the time between two observations; the default corresponds to 1 second
Returns:

movement

Return type:

numpy array

Examples

>>> dist = distances_bymouse(0, 0, step=1e2)
mousestyles.data.distances_bystrain(strain, step=50, verbose=False)[source]

Aggregates distances_bymouse for all mice in one given strain.

More specifically:

  • let \(d^1,...,d^M\) be the sequence of distances for one particular strain for mice \(1\) to \(M\).
  • The function returns the concatenation of the \(d^i\).
Parameters:
  • strain (int) – nonnegative integer indicating the strain number
  • step (float) – positive float defining the time between two observations; the default corresponds to 1 second
Returns:

movement

Return type:

numpy array

Examples

>>> dist = distances_bystrain(0, step=1e2)
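
The two levels of aggregation (days into a mouse, mice into a strain) amount to nested concatenation. A small sketch with synthetic arrays, where all variable names are illustrative:

```python
import numpy as np

# Synthetic per-day distance arrays for two mice of one strain.
mouse_a_days = [np.array([0.0, 1.5]), np.array([0.0, 2.0, 0.5])]
mouse_b_days = [np.array([0.0, 3.0])]

# Per mouse: concatenate the days in order.
by_mouse = [np.concatenate(days) for days in (mouse_a_days, mouse_b_days)]
# Per strain: concatenate the per-mouse arrays in order.
by_strain = np.concatenate(by_mouse)
```
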
mousestyles.data.load_all_features()[source]

Returns a (21131, 13) size pandas.DataFrame object corresponding to 9 features over each mouse’s 2-hour time bin. The first four columns index each mouse’s 2-hour bin:

  • Column 0: the strain of the mouse (0-15)
  • Column 1: the mouse number (number depends on strain)
  • Column 2: the day number (5-16)
  • Column 3: the 2-hour time bin (e.g., value 4 corresponds to hours 4 to 6)

The remaining 9 columns are the computed features.

Returns: features_data_frame – A dataframe of computed features.
Return type: pandas.DataFrame
mousestyles.data.load_intervals(feature)[source]

Return a pandas.DataFrame object of project interval data for the specified feature.

There are 5 columns in the dataframe:

  • strain: the strain number of the mouse
  • mouse: the mouse number in its strain
  • day: the day number
  • start: the start time
  • stop: the stop time

Parameters: feature ({"AS", "F", "IS", "M_AS", "M_IS", "W"}) – the feature whose intervals are returned
Returns: intervals – All data of the specified feature as a dataframe
Return type: pandas.DataFrame

Examples

>>> AS = load_intervals('AS')
>>> IS = load_intervals('IS')
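
Given the five documented columns, interval durations follow directly from start and stop. The sketch below uses a small synthetic DataFrame as a stand-in for the returned data; the `duration` column is added here for illustration and is not part of the documented output.

```python
import pandas as pd

# Synthetic stand-in with the documented columns of load_intervals output.
intervals = pd.DataFrame({
    "strain": [0, 0, 0],
    "mouse": [0, 0, 1],
    "day": [5, 5, 5],
    "start": [10.0, 50.0, 20.0],
    "stop": [30.0, 55.0, 60.0],
})

# Length of each interval, then total time per mouse day.
intervals["duration"] = intervals["stop"] - intervals["start"]
total = intervals.groupby(["strain", "mouse", "day"])["duration"].sum()
```
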
mousestyles.data.load_mouseday_features(features=None)[source]

Returns a (1921, 3+11*n) size pandas.DataFrame object corresponding to each 2-hour time bin of the n inputted features over each mouse. The first three columns index each mouse:

  • Column 0: the strain of the mouse (0-15)
  • Column 1: the mouse number (number depends on strain)
  • Column 2: the day number (5-16)

The remaining 11*n columns are the values for each 2-hour time bin of the n inputted features.

Parameters: features (list, optional) – A list of one or more features chosen from {“ASProbability”, “ASNumbers”, “ASDurations”, “Food”, “Water”, “Distance”, “ASFoodIntensity”, “ASWaterIntensity”, “MoveASIntensity”}. Defaults to all features when omitted.
Returns: features_data_frame – A dataframe of computed features.
Return type: pandas.DataFrame

Examples

>>> mouseday = load_mouseday_features()
>>> mouseday = load_mouseday_features(["Food"])
>>> mouseday = load_mouseday_features(["Food", "Water", "Distance"])
mousestyles.data.load_movement(strain, mouse, day)[source]

Return a pandas.DataFrame object of project movement data for the specified combination of strain, mouse and day.

There are 4 columns in the dataframe:

  • t: Time coordinates (in seconds)
  • x: X coordinates indicating the left-right position of the cage
  • y: Y coordinates indicating the front-back position of the cage
  • isHB: Boolean indicating whether the point is in the home base or not

Parameters:
  • strain (int) – nonnegative integer indicating the strain number
  • mouse (int) – nonnegative integer indicating the mouse number
  • day (int) – nonnegative integer indicating the day number
Returns:

movement – CT, CX, CY coordinates and home base status of the combination of strain, mouse and day

Return type:

pandas.DataFrame

Examples

>>> movement = load_movement(0, 0, 0)
>>> movement = load_movement(1, 2, 1)
mousestyles.data.load_movement_and_intervals(strain, mouse, day, features=[u'AS', u'F', u'IS', u'M_AS', u'M_IS', u'W'])[source]

Return a pandas.DataFrame object of project movement and interval data for the specified combination of strain, mouse and day.

There are 4 + len(features) columns in the dataframe:

  • t: Time coordinates (in seconds)
  • x: X coordinates indicating the left-right position of the cage
  • y: Y coordinates indicating the front-back position of the cage
  • isHB: Boolean indicating whether the point is in the home base or not
  • Additional columns taking their names from features: Boolean indicating whether the time point is in an interval of behavior of the given feature.

Parameters:
  • strain (int) – nonnegative integer indicating the strain number
  • mouse (int) – nonnegative integer indicating the mouse number
  • day (int) – nonnegative integer indicating the day number
  • features (list (or other iterable) of strings) – list of features from {“AS”, “F”, “IS”, “M_AS”, “M_IS”, “W”}
Returns:

movement – CT, CX, CY coordinates, home base status, and feature interval information for a given strain, mouse and day

Return type:

pandas.DataFrame

Examples

>>> import numpy as np
>>> m1 = load_movement(1, 1, 1)
>>> m2 = load_movement_and_intervals(1, 1, 1, []) # don't add any features
>>> np.all(m1 == m2)
True
>>> m3 = load_movement_and_intervals(1, 1, 1, ['AS'])
>>> m3.shape[1] == m1.shape[1] + 1 # adds one column
True
>>> m3.shape[0] == m1.shape[0] # same number of rows
True
>>> m3[29:32]
            t      x       y   isHB     AS
29  56448.333 -6.289  34.902  False  False
30  56448.653 -5.509  34.173   True   True
31  56449.273 -5.048  33.284   True   True
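
The Boolean feature columns mark whether each time point falls inside an interval of the feature. One way to compute such a flag is sketched below; the name `in_intervals_sketch` is hypothetical, and it assumes sorted, non-overlapping intervals with both endpoints included, which may differ from the library's convention.

```python
import numpy as np

def in_intervals_sketch(times, starts, stops):
    """Flag each time that lies in some interval [start, stop].

    Hypothetical helper; assumes intervals are sorted and non-overlapping.
    """
    times = np.asarray(times, dtype=float)
    starts = np.asarray(starts, dtype=float)
    stops = np.asarray(stops, dtype=float)
    # Index of the last interval starting at or before each time (-1 if none).
    idx = np.searchsorted(starts, times, side="right") - 1
    inside = np.zeros(len(times), dtype=bool)
    valid = idx >= 0
    # A time is inside if it does not exceed the stop of that interval.
    inside[valid] = times[valid] <= stops[idx[valid]]
    return inside

times = [1.0, 2.5, 4.0, 7.0]
flags = in_intervals_sketch(times, starts=[2.0, 6.0], stops=[3.0, 8.0])
```
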
mousestyles.data.load_start_time_end_time(strain, mouse, day)[source]

Returns the start and end times recorded for the mouse-day. The first number indicates the number of seconds elapsed since midnight; the second number indicates when the cage is closed for cleaning. In other words, this is the interval during which all sensors are active.

Parameters:
  • strain (int) – nonnegative integer indicating the strain number
  • mouse (int) – nonnegative integer indicating the mouse number
  • day (int) – nonnegative integer indicating the day number
Returns:

times – the start time and end time

Return type:

a tuple of (float, float)

mousestyles.data.load_time_matrix_dynamics()[source]

Load the time matrix for the dynamics pattern project.