DataFrame API Reference

Conversion

spiketimes.df.conversion.df_to_list(df: pandas.core.frame.DataFrame, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain')[source]

Convert a DataFrame of spiketimes to a list of spiketrains.

Parameters
  • df – A pandas DataFrame of spiketimes indexed by spiketrains

  • spiketimes_col – The column containing spiketimes

  • spiketrain_col – The column containing spiketrain identifiers

Returns

spiketrain_IDs, spiketrain_list

spiketimes.df.conversion.df_to_list_of_dicts(df: pandas.core.frame.DataFrame, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain')[source]

Convert a DataFrame of spiketrains to a list of dicts of spiketrains.

The dicts are of the form {spiketrain_id: spiketimes_array}

Parameters
  • df – A pandas DataFrame of spiketimes indexed by spiketrains

  • spiketimes_col – The column containing spiketimes

  • spiketrain_col – The column containing spiketrain identifiers

Returns

A list of dictionaries of the form {spiketrain_id: spiketimes_array}

spiketimes.df.conversion.list_of_dicts_to_df(list_of_dicts: list, returned_spiketimes_label: str = 'spiketimes', returned_spiketrain_label: str = 'spiketrain')[source]

Convert a list of named spiketrains to a dataframe.

Data must be in the format of [{“st_name”: spiketimes_arr}, etc]

Parameters
  • list_of_dicts – A list of dicts. The dict key is the spiketrain identifier. The dict value is a numpy array of spiketimes.

  • returned_spiketimes_label – The label of the column in the returned DataFrame containing spiketimes

  • returned_spiketrain_label – The label of the column in the returned DataFrame containing spiketrain identifiers

Returns

A pandas DataFrame containing spiketimes indexed by spiketrain.

spiketimes.df.conversion.list_to_df(spiketrains: list, indexes: list = None, returned_spiketimes_label: str = 'spiketimes', returned_spiketrain_label: str = 'spiketrain')[source]

Convert a list of spiketrains into a tidy dataframe of spiketimes

Parameters
  • spiketrains – A list of numpy-array spiketrains

  • indexes – An optional list of labels for the spiketrains

  • returned_spiketimes_label – The label of the column in the returned DataFrame containing spiketimes

  • returned_spiketrain_label – The label of the column in the returned DataFrame containing spiketrain identifiers

Returns

A pandas DataFrame containing one spike and id label per row
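To illustrate the tidy layout these conversion functions work with, the pure-Python sketch below flattens a list of spiketrains into one-row-per-spike records and groups them back again. It does not call the library: the helper names to_tidy_rows and from_tidy_rows are hypothetical, and a plain list of tuples stands in for the DataFrame.

```python
from collections import defaultdict


def to_tidy_rows(spiketrains, indexes=None):
    """Flatten a list of spiketrains into (spiketrain_id, spiketime) rows."""
    if indexes is None:
        # Assumption: unlabeled spiketrains are numbered 0, 1, 2, ...
        indexes = list(range(len(spiketrains)))
    return [(label, t) for label, train in zip(indexes, spiketrains) for t in train]


def from_tidy_rows(rows):
    """Group (spiketrain_id, spiketime) rows back into ids and spiketrains."""
    grouped = defaultdict(list)
    for label, t in rows:
        grouped[label].append(t)
    ids = list(grouped)  # insertion order: first appearance of each id
    return ids, [grouped[label] for label in ids]


rows = to_tidy_rows([[0.1, 0.5], [0.2]], indexes=["a", "b"])
ids, trains = from_tidy_rows(rows)
```

Each row carries one spike and its identifier, which is the shape the Returns description above refers to.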

Simulating Spiketrains

spiketimes.df.simulate.homogeneous_poisson_processes(rate: float, t_stop: float, n: int, t_start: float = 0)[source]

Simulate n spiketrains as homogeneous Poisson processes.

Each spiketrain has the same characteristics.

Parameters
  • rate – the intensity of the Poisson processes, in events per second

  • t_stop – the time after which sampling stops

  • n – the number of spiketrains to simulate

  • t_start – the time from which sampling starts

Returns

A pandas dataframe containing the simulated spiketrains with columns {“spiketrain”, “spiketimes”}
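One common way to simulate a homogeneous Poisson process is to accumulate exponentially distributed inter-spike intervals. The sketch below shows that idea in pure Python; it is not the library's implementation, and the function name homogeneous_poisson and its rng argument are hypothetical.

```python
import random


def homogeneous_poisson(rate, t_stop, t_start=0.0, rng=None):
    """Sample spike times on [t_start, t_stop) from a constant-rate Poisson process."""
    if rng is None:
        rng = random.Random()
    spikes = []
    # Inter-spike intervals of a Poisson process are exponential with mean 1 / rate.
    t = t_start + rng.expovariate(rate)
    while t < t_stop:
        spikes.append(t)
        t += rng.expovariate(rate)
    return spikes


rng = random.Random(0)
# Three spiketrains sharing the same rate and duration.
trains = [homogeneous_poisson(rate=5.0, t_stop=10.0, rng=rng) for _ in range(3)]
```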

spiketimes.df.simulate.imhomogeneous_poisson_processes(time_rate: list, n: int, t_start: float = 0)[source]

Simulate n spiketrains as inhomogeneous Poisson processes.

Each spiketrain has the same time-varying firing rates.

Parameters
  • time_rate – list of tuples containing the timespan and rate of each firing rate (time_span, firing_rate).

  • n – number of spiketrains to generate

  • t_start – if specified, starts the first time interval in time_rate from this time.

Returns

A pandas dataframe containing the simulated spiketrains with columns {“spiketrain”, “spiketimes”}
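A piecewise-constant rate profile can be simulated by running a homogeneous Poisson process inside each segment in turn. The sketch below assumes that each time_span in time_rate is a duration in seconds, which is an interpretation rather than something stated above; the function name piecewise_poisson is hypothetical and the code does not call the library.

```python
import random


def piecewise_poisson(time_rate, t_start=0.0, rng=None):
    """Sample spikes from consecutive segments given as (time_span, firing_rate)."""
    if rng is None:
        rng = random.Random()
    spikes = []
    seg_start = t_start
    for time_span, rate in time_rate:
        seg_stop = seg_start + time_span  # assumption: time_span is a duration
        if rate > 0:
            # Restarting at each boundary is valid because the process is memoryless.
            t = seg_start + rng.expovariate(rate)
            while t < seg_stop:
                spikes.append(t)
                t += rng.expovariate(rate)
        seg_start = seg_stop
    return spikes


# 2 s at 10 Hz, 3 s of silence, then 2 s at 20 Hz.
spikes = piecewise_poisson([(2.0, 10.0), (3.0, 0.0), (2.0, 20.0)], rng=random.Random(1))
```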

Generating Spiketrain Surrogates

spiketimes.df.surrogates.jitter_spiketrains(df: pandas.core.frame.DataFrame, jitter_window_size: float, spiketimes_col: str = 'spiketimes', n: int = 1, returned_surrogate_label: str = 'surrogate')[source]

Create multiple spiketime-jittered surrogates from a single parent spiketrain.

Given a DataFrame containing the spiketimes from a neuron, returns a DataFrame of n surrogate spiketrains by binning spike counts and randomly dispersing spiketimes within each time bin.

Parameters
  • df – dataframe containing the data

  • spiketimes_col – label of column containing spiketimes

  • jitter_window_size – binwidth in seconds used to bin spike counts

  • n – number of surrogate spiketrains to replicate

  • returned_surrogate_label – column label indicating surrogate identity

Returns

a pandas dataframe of spiketimes indexed by spiketrain
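The jittering procedure described above (count spikes per bin, then redistribute them at random within each bin) can be sketched for a single spiketrain as follows. This is an illustration in pure Python, not the library's code; the name jitter_spiketrain is hypothetical, and starting the bins at the first spike is an assumption.

```python
import random
from collections import Counter


def jitter_spiketrain(spiketimes, jitter_window_size, rng=None):
    """Return a surrogate with the same spike count in every jitter window."""
    if rng is None:
        rng = random.Random()
    t0 = min(spiketimes)  # assumption: bins start at the first spike
    counts = Counter(int((t - t0) // jitter_window_size) for t in spiketimes)
    surrogate = []
    for bin_idx, n_spikes in counts.items():
        left = t0 + bin_idx * jitter_window_size
        # Place each spike uniformly at random inside its original bin.
        surrogate.extend(
            left + rng.random() * jitter_window_size for _ in range(n_spikes)
        )
    return sorted(surrogate)


parent = [0.0, 0.1, 0.9, 1.2, 2.7]
surrogate = jitter_spiketrain(parent, jitter_window_size=1.0, rng=random.Random(0))
```

The surrogate keeps the coarse firing-rate profile of the parent while destroying spike timing finer than the window size.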

spiketimes.df.surrogates.jitter_spiketrains_by(df: pandas.core.frame.DataFrame, jitter_window_size: float, spiketimes_col: str = 'spiketimes', by_col: str = 'spiketrain', n: int = 1)[source]

Create multiple spiketime-jittered surrogates for each spiketrain in a DataFrame.

Given a DataFrame of spiketimes grouped by spiketrain or another column, generates n surrogates from each spiketrain. Each surrogate is built by binning the parent's spikes from the start of the spiketrain until its end, then generating a spiketrain with the same number of spikes in each time bin but with randomised spiketimes.

Parameters
  • df – dataframe containing the data

  • jitter_window_size – binwidth in seconds used to bin spike counts

  • spiketimes_col – label of column containing spiketimes

  • by_col – label of column indicating group (e.g. neuron_id, spiketrain_id or trial_id)

  • n – number of surrogates to generate per group

Returns

A pandas dataframe containing surrogate spiketrain indexed by a ‘surrogate_replicate’ column

spiketimes.df.surrogates.shuffled_isi_spiketrains(df: pandas.core.frame.DataFrame, spiketimes_col: str = 'spiketimes', n: int = 1, returned_surrogate_label: str = 'surrogate')[source]

Create multiple shuffled-ISI surrogates from a single parent spiketrain.

Given a DataFrame containing the spiketimes from a neuron, returns a DataFrame of n surrogate spiketrains by shuffling inter-spike intervals.

Parameters
  • df – dataframe containing the data

  • spiketimes_col – label of column containing spiketimes

  • n – number of surrogate spiketrains to replicate

  • returned_surrogate_label – column label indicating surrogate identity

Returns

a pandas dataframe of spiketimes indexed by spiketrain
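Shuffling inter-spike intervals (ISIs) keeps the ISI distribution of the parent but destroys the order in which intervals occur. A minimal pure-Python sketch for one spiketrain is shown below; the name shuffled_isi_surrogate is hypothetical, and anchoring the surrogate at the parent's first spike is an assumption.

```python
import random


def shuffled_isi_surrogate(spiketimes, rng=None):
    """Build a surrogate spiketrain by permuting the parent's inter-spike intervals."""
    if rng is None:
        rng = random.Random()
    spiketimes = sorted(spiketimes)
    isis = [b - a for a, b in zip(spiketimes, spiketimes[1:])]
    rng.shuffle(isis)
    surrogate = [spiketimes[0]]  # assumption: keep the first spike time
    for isi in isis:
        surrogate.append(surrogate[-1] + isi)
    return surrogate


parent = [0.0, 1.0, 3.0, 6.0, 10.0]
surrogate = shuffled_isi_surrogate(parent, rng=random.Random(0))
surrogate_isis = sorted(b - a for a, b in zip(surrogate, surrogate[1:]))
```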

spiketimes.df.surrogates.shuffled_isi_spiketrains_by(df: pandas.core.frame.DataFrame, spiketimes_col: str = 'spiketimes', by_col: str = 'spiketrain', n: int = 1)[source]

Create multiple shuffled-ISI surrogates for each spiketrain in a DataFrame.

Given a DataFrame of spiketimes grouped by spiketrain or another column, generates n surrogates from each spiketrain. Surrogates are generated by shuffling inter-spike intervals.

Parameters
  • df – dataframe containing the data

  • spiketimes_col – label of column containing spiketimes

  • by_col – label of column indicating group (e.g. neuron_id, spiketrain_id or trial_id)

  • n – number of surrogates to generate per group

Returns

A pandas dataframe containing surrogate spiketrain indexed by a ‘surrogate_replicate’ column

Alignment

spiketimes.df.alignment.align_around(df: pandas.core.frame.DataFrame, data_colname: str, events: numpy.ndarray, max_latency: float = None, t_before: float = None, drop: bool = False)[source]

Aligns data to events.

Pass a DataFrame containing the data to be aligned and an array of event timestamps to align to. All data is aligned to the same events. Use spiketimes.df.alignment.align_around_by to align by group.

Parameters
  • df – The dataframe containing the data to be aligned

  • data_colname – The column name in df to be aligned

  • events – A series or numpy array of event timestamps to align to

  • max_latency – If specified, any latencies above this will be returned as np.nan

  • t_before – The desired negative window before the onset of the event to align to

  • drop – Whether to drop np.nan values

Returns

A copy of df with an additional column, aligned, containing the values in df[data_colname] aligned to events
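The sketch below illustrates one plausible reading of event alignment: each timestamp is expressed as a latency from the most recent event, timestamps within t_before of the next event get a negative latency, and latencies above max_latency become NaN. These semantics are assumptions inferred from the parameter descriptions, the name align_to_events is hypothetical, and the code works on plain lists rather than a DataFrame.

```python
import bisect
import math


def align_to_events(times, events, max_latency=None, t_before=None):
    """Express each timestamp as a latency relative to a nearby event."""
    events = sorted(events)
    aligned = []
    for t in times:
        idx = bisect.bisect_right(events, t)  # number of events at or before t
        latency = t - events[idx - 1] if idx > 0 else math.nan
        # Assumed meaning of t_before: timestamps shortly before the next
        # event are reported as negative latencies to that event.
        if t_before is not None and idx < len(events) and events[idx] - t <= t_before:
            latency = t - events[idx]
        if max_latency is not None and latency > max_latency:
            latency = math.nan
        aligned.append(latency)
    return aligned


aligned = align_to_events(
    [0.5, 10.25, 10.75, 19.5, 45.0],
    events=[10.0, 20.0],
    max_latency=5.0,
    t_before=1.0,
)
```

Here the first and last timestamps have no usable event and come back as NaN, which is what the drop argument would then remove.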

spiketimes.df.alignment.align_around_by(df_data: pandas.core.frame.DataFrame, df_events: pandas.core.frame.DataFrame, df_data_data_colname: str = 'spiketimes', df_events_group_colname: str = 'session', df_data_group_colname: str = 'session', df_events_event_colname: str = 'spiketimes', max_latency: float = None, t_before: float = None)[source]

Align data to events. Align different datapoints to different events.

Aligns data in a pandas DataFrame (df_data) to events in an events DataFrame (df_events). Data is aligned to events sharing the same group. Useful when aligning data from different sessions to events from different sessions.

Parameters
  • df_data – A pandas DataFrame containing data to be aligned

  • df_data_data_colname – The label of the column in df_data containing the data to be aligned

  • df_data_group_colname – The label of the column in df_data containing group membership identifiers.

  • df_events – The DataFrame containing the events to align the data to

  • df_events_event_colname – The label of column in df_events containing events

  • df_events_group_colname – The label of the column in df_events containing group membership identifiers (e.g. session id).

  • max_latency – If specified, any latencies above this will be returned as np.nan

  • t_before – The desired negative window before the onset of the event to align to

Returns

A copy of df_data with an additional column ‘aligned’ containing data aligned to events.

Binning

spiketimes.df.binning.binned_spiketrain(df: pandas.core.frame.DataFrame, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain', fs: str = 1, t_start: float = None, t_stop: float = None)[source]

Get event counts by entity at a constant sampling rate.

Parameters
  • df – Pandas dataframe containing the data

  • fs – Desired sampling frequency in seconds

  • spiketimes_col – The label of the column in df containing spiketimes

  • spiketrain_col – The label of the column in df containing spiketrain identifiers.

  • t_start – The time after which the first bin will start. Default is 0.

  • t_stop – The maximum time for the time bins.

Returns

A pandas DataFrame containing the binned data. The time column contains the left edge of the time bin. spike_count contains the number of spikes occurring in that bin.
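Binning at a constant rate amounts to counting spikes in equal-width windows and labeling each window by its left edge. The pure-Python sketch below shows this for one spiketrain; bin_counts and its bin_width argument are hypothetical stand-ins and do not reflect how the library derives bin size from fs.

```python
def bin_counts(spiketimes, bin_width, t_start, t_stop):
    """Count spikes in equal-width bins; return (left_edge, spike_count) pairs."""
    n_bins = int((t_stop - t_start) // bin_width)
    counts = [0] * n_bins
    for t in spiketimes:
        idx = int((t - t_start) // bin_width)
        if 0 <= idx < n_bins:  # ignore spikes outside [t_start, t_stop)
            counts[idx] += 1
    return [(t_start + i * bin_width, c) for i, c in enumerate(counts)]


binned = bin_counts([0.1, 0.2, 0.7, 1.6, 3.9], bin_width=1.0, t_start=0.0, t_stop=4.0)
```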

spiketimes.df.binning.binned_spiketrain_bins_provided(df: pandas.core.frame.DataFrame, bins: numpy.ndarray, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain')[source]

Get event count per item in user-specified bins.

Designed to bin spiketrains but works on any set of events.

Parameters
  • df – A pandas DataFrame containing the data

  • bins – A numpy array of time bins

  • spiketimes_col – The label of the column in df containing spiketimes

  • spiketrain_col – The label of the column in df containing spiketrain identifiers.

Returns

A pandas DataFrame with columns indicating the unit (spiketrain_col), time bin and event counts.

spiketimes.df.binning.spike_count_around_event(df: pandas.core.frame.DataFrame, events: numpy.ndarray, binsize: float, spiketimes_col: str = 'spiketimes', by_col: str = 'spiketrain')[source]

Get spike counts for each neuron following events.

Parameters
  • df – A pandas DataFrame containing the spike data.

  • events – A numpy array of event timings.

  • binsize – The timeperiod after each event during which spikes are counted.

  • spiketimes_col – The label of the column in df containing spiketimes.

  • by_col – The label of the column in df containing spiketrain identifiers.

Returns

A pandas DataFrame with columns identifying the spiketrain, event and spike counts.
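Counting spikes after each event reduces to counting the spiketimes that fall in the window starting at the event and lasting binsize. A minimal sketch on sorted lists using binary search is given below; the name counts_after_events is hypothetical, and treating the window as half-open, [event, event + binsize), is an assumption.

```python
import bisect


def counts_after_events(spiketimes, events, binsize):
    """Count spikes in [event, event + binsize) for every event."""
    spiketimes = sorted(spiketimes)
    counts = []
    for event in events:
        lo = bisect.bisect_left(spiketimes, event)
        hi = bisect.bisect_left(spiketimes, event + binsize)
        counts.append(hi - lo)
    return counts


counts = counts_after_events(
    [0.5, 1.1, 1.2, 1.9, 5.05, 5.4], events=[1.0, 5.0, 8.0], binsize=0.5
)
```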

spiketimes.df.binning.spike_count_around_event_by(df_data: pandas.core.frame.DataFrame, binsize: float, df_data_data_colname: str, df_data_group_colname: str, df_data_spiketrain_colname: str, df_events: pandas.core.frame.DataFrame, df_events_event_colname: str, df_events_group_colname: str)[source]

Get spike counts around events where you have different sets of spiketrains and events.

Parameters
  • df_data – A pandas DataFrame containing the spike times

  • binsize – The duration of the period after each event during which spikes are counted

  • df_data_data_colname – The label of the column in df_data containing the spiketime data

  • df_data_group_colname – The label of the column in df_data containing the group data (e.g. session_id)

  • df_data_spiketrain_colname – The label of the column in df_data containing spiketrain ids

  • df_events – A pandas DataFrame containing event timings

  • df_events_event_colname – The label of the column in df_events containing event timings

  • df_events_group_colname – The label of the column in df_events containing group identifiers (e.g. session_id).

Returns

A pandas DataFrame with one row per event per spiketrain with columns identifying the event, spike counts, group and spiketrain.

spiketimes.df.binning.which_bin(df: pandas.core.frame.DataFrame, bin_edges: numpy.ndarray, allow_before: bool = False, max_latency: float = None, before: float = None, spiketimes_col: str = 'spiketimes')[source]

Returns the closest bin for each data element. Useful for assigning spikes to trials.

Parameters
  • df – A pandas DataFrame containing the data to be binned

  • bin_edges – A numpy array of edges to bin into.

  • before – If specified, shifts the bins backwards by this quantity before the spiketrain is aligned to them.

  • allow_before – If False, spikes occurring before the first time bin return np.nan

  • max_latency – If specified, np.nan is returned for any spikes occurring more than this quantity after the maximum bin_edge

  • spiketimes_col – The label of the column in df containing spiketimes

Returns

A copy of the passed DataFrame with two additional columns, ‘bin_values’ and ‘bin_idx’, containing the value and index in the corresponding event array of the appropriate event.

spiketimes.df.binning.which_bin_by(df_data: pandas.core.frame.DataFrame, df_data_data_colname: str, df_data_group_colname: str, df_events: pandas.core.frame.DataFrame, df_events_event_colname: str, df_events_group_colname: str, max_latency: float = None, before: float = None, allow_before: bool = False)[source]

Get corresponding bin per data point. Searches bins by group.

Parameters
  • df_data – the df containing the data to be binned

  • df_data_data_colname – label of the column in df_data containing the data to be binned

  • df_data_group_colname – label of the column in df_data containing group membership identifiers. This could be session id, mouse id etc.

  • df_data_spiketrain_colname – label of the column in df_data containing spiketrain id (could also be event_type)

  • df_events – the df containing the events to align the data to

  • df_events_event_colname – label of the column in df_events containing events

  • df_events_group_colname – label of the column in df_events containing group membership identifiers (e.g. session id).

  • max_latency – if specified, any latencies above this will be returned as np.nan

  • before – the desired negative window before the onset of the event to align to

  • allow_before – if true allows for negative idx

Returns

A copy of df_data with two additional columns, ‘bin_values’ and ‘bin_idx’, containing the value and index in the corresponding event array of the appropriate event.

Statistics

spiketimes.df.statistics.auc_roc_test_by(df: pandas.core.frame.DataFrame, n_boot: int = 1000, return_distance_from_chance: bool = False, spikecount_col: str = 'spike_count', spiketrain_col: str = 'spiketrain', condition_col: str = 'cond')[source]

Calculates the Area Under the Receiver Operating Characteristic Curve of spike counts for each spiketrain.

The AUCROC can be used as a metric of the separability of two distributions. Each spiketrain must have been recorded in both conditions during multiple trials. Significance is tested using a permutation test.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • n_boot – The number of permutation replicates to draw.

  • spikecount_col – The label of the column containing spikecounts

  • spiketrain_col – The label of the column identifying the spiketrain responsible for the spike

  • condition_col – A categorical column containing 0 for the baseline condition and 1 for the experimental condition

  • return_distance_from_chance – If True, returns distance from 0.5

Returns

A pandas DataFrame containing one row per spiketrain with columns {‘spiketrain’, ‘AUCROC’, ‘p’}
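For two samples of spike counts, the area under the ROC curve equals the probability that a randomly chosen experimental count exceeds a randomly chosen baseline count, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the name auc_roc is hypothetical and this is not the library's implementation.

```python
def auc_roc(baseline, experimental):
    """AUC as P(experimental > baseline) + 0.5 * P(experimental == baseline)."""
    wins = 0.0
    for x1 in experimental:
        for x0 in baseline:
            if x1 > x0:
                wins += 1.0
            elif x1 == x0:
                wins += 0.5
    return wins / (len(baseline) * len(experimental))


separated = auc_roc([1, 2, 3], [4, 5, 6])   # experimental always larger
identical = auc_roc([1, 2, 3], [1, 2, 3])   # no separation
reversed_ = auc_roc([4, 5, 6], [1, 2, 3])   # experimental always smaller
```

A value of 0.5 means the two conditions cannot be told apart from spike counts alone, which is why the distance from 0.5 is a natural effect size.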

spiketimes.df.statistics.cv2_isi_by(df: pandas.core.frame.DataFrame, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain')[source]

Calculate cv2 of interspike intervals of each spiketrain.

cv2 is a metric related to the coefficient of variation. It is adapted to be suitable for long-period spiketrains.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • spiketimes_col – The label of the column containing spiketimes

  • spiketrain_col – The label of the column identifying the spiketrain responsible for the spike

Returns

A DataFrame containing cv2_isi by neuron

spiketimes.df.statistics.cv_isi_by(df: pandas.core.frame.DataFrame, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain')[source]

Calculate the coefficient of variation of interspike intervals for each spiketrain in a DataFrame.

The cv_isi is a metric of spike regularity. Values near 1 are typical of Poisson processes. Values near 0 indicate very regular processes.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • spiketimes_col – The label of the column containing spiketimes

  • spiketrain_col – The label of the column identifying the spiketrain responsible for the spike

Returns

A DataFrame containing cv_isi by neuron
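The two regularity metrics can be sketched for a single spiketrain as follows. The CV is the standard deviation of the inter-spike intervals divided by their mean. For CV2 the sketch uses the commonly cited local definition, the mean of 2|I(i+1) - I(i)| / (I(i+1) + I(i)) over adjacent intervals; whether the library uses exactly this formula, or the population rather than sample standard deviation, is an assumption. Function names are hypothetical.

```python
import statistics


def isis(spiketimes):
    """Inter-spike intervals of a spiketrain."""
    spiketimes = sorted(spiketimes)
    return [b - a for a, b in zip(spiketimes, spiketimes[1:])]


def cv_isi(spiketimes):
    """Coefficient of variation: std(ISI) / mean(ISI), population std assumed."""
    intervals = isis(spiketimes)
    return statistics.pstdev(intervals) / statistics.mean(intervals)


def cv2_isi(spiketimes):
    """Mean of 2 * |I2 - I1| / (I2 + I1) over adjacent interval pairs."""
    intervals = isis(spiketimes)
    terms = [2 * abs(b - a) / (a + b) for a, b in zip(intervals, intervals[1:])]
    return statistics.mean(terms)


regular = [0.0, 1.0, 2.0, 3.0, 4.0]      # ISIs: 1, 1, 1, 1
alternating = [0.0, 1.0, 4.0, 5.0, 8.0]  # ISIs: 1, 3, 1, 3
```

Because CV2 compares only neighbouring intervals, it is less sensitive than the CV to slow drifts in firing rate over a long recording.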

spiketimes.df.statistics.diffmeans_test_by(df: pandas.core.frame.DataFrame, n_boot: int = 1000, spikecount_col: str = 'spike_count', spiketrain_col: str = 'spiketrain', condition_col: str = 'cond')[source]

Calculates the difference between means of spike counts for each spiketrain in a DataFrame and tests significance using a permutation test.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • n_boot – The number of permutation replicates to draw.

  • spikecount_col – The label of the column containing spikecounts

  • spiketrain_col – The label of the column identifying the spiketrain responsible for the spike

  • condition_col – A categorical column containing 0 for the baseline condition and 1 for the experimental condition

Returns

A pandas DataFrame containing one row per spiketrain with columns {‘spiketrain’, ‘diff_of_means’, ‘p’}
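A permutation test for a difference of means repeatedly shuffles the condition labels and asks how often the shuffled difference is at least as extreme as the observed one. The sketch below shows a two-sided version for one spiketrain; the function name is hypothetical, and the two-sided comparison and the (count + 1) / (n_boot + 1) p-value are assumptions rather than details taken from the library.

```python
import random


def diffmeans_permutation_test(baseline, experimental, n_boot=1000, rng=None):
    """Return (observed difference of means, two-sided permutation p-value)."""
    if rng is None:
        rng = random.Random()

    def mean(xs):
        return sum(xs) / len(xs)

    observed = mean(experimental) - mean(baseline)
    pooled = list(baseline) + list(experimental)
    n_base = len(baseline)
    n_extreme = 0
    for _ in range(n_boot):
        rng.shuffle(pooled)  # reassign condition labels at random
        permuted = mean(pooled[n_base:]) - mean(pooled[:n_base])
        if abs(permuted) >= abs(observed):
            n_extreme += 1
    # Adding one to numerator and denominator avoids reporting p = 0.
    p = (n_extreme + 1) / (n_boot + 1)
    return observed, p


observed, p = diffmeans_permutation_test(
    [1, 2, 1, 3, 2, 1, 2, 3], [8, 9, 7, 9, 8, 10, 9, 8], rng=random.Random(0)
)
```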

spiketimes.df.statistics.fraction_silent_by(df: pandas.core.frame.DataFrame, binsize: float = 1, silent_threshold: float = 0.5, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain', t_start: float = None, t_stop: float = None)[source]

Estimate the fraction of time a spiketrain was inactive.

The estimate is calculated by binning spikes into time bins and calculating the proportion of bins falling below a specified threshold.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • binsize – The time period in seconds to use when binning spikes.

  • spiketimes_col – The label of the column containing spiketimes

  • spiketrain_col – The label of the column identifying the spiketrain responsible for the spike

  • t_start – Time point at which to start. Defaults to time of first spike in df.

  • t_stop – Maximum timepoint. Defaults to last spike in df.

Returns

A pandas DataFrame containing fraction silent estimates by neuron.

spiketimes.df.statistics.ifr_by(df: pandas.core.frame.DataFrame, fs: float = 1, sigma: float = None, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain', t_start: float = None, t_stop: float = None)[source]

Estimate firing rate for each spiketrain at a regular sampling rate.

Parameters
  • df – A pandas DataFrame containing the spikes data

  • fs – The sampling rate at which to estimate firing rate

  • sigma – Hyperparameter controlling smoothing for firing rate estimates

  • spiketimes_col – The label of the column in df containing spiketimes

  • spiketrain_col – The label of the column in df containing spiketrain identifiers (which spiketrain was responsible for the spike)

  • t_start – Time point at which to start firing rate estimates. Defaults to time of first spike in df.

  • t_stop – Time point of maximum firing rate estimate. Defaults to last spike in df.

Returns

A pandas DataFrame with one row per timepoint per spiketrain with column ifr identifying firing rate estimates.

spiketimes.df.statistics.mean_firing_rate_by(df: pandas.core.frame.DataFrame, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain', t_start: float = None, t_stop: float = None)[source]

Estimate the mean firing rate of each spiketrain.

Firing rate is calculated by summing spikes and dividing by total time.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • spiketimes_col – The label of the column containing spiketimes

  • spiketrain_col – The label of the column identifying the spiketrain responsible for the spike

  • t_start – Time point at which to start. Defaults to time of first spike in df.

  • t_stop – Maximum timepoint. Defaults to last spike in df.

Returns

A DataFrame containing mean firing rate by neuron

spiketimes.df.statistics.mean_firing_rate_ifr_by(df: pandas.core.frame.DataFrame, fs: float = 1, sigma: float = None, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain', exclude_below: float = None, t_start: float = None, t_stop: float = None)[source]

Estimate mean firing rate of each neuron by first estimating firing rate at a regular interval and then taking the median.

Parameters
  • df – A pandas Dataframe containing the spike data

  • fs – The sampling rate at which to estimate firing rate

  • sigma – Parameter controlling smoothing level of firing rate estimates.

  • exclude_below – If specified, firing rates below this value will not be included in the median calculation.

  • spiketimes_col – The label of the column containing the spiketimes

  • spiketrain_col – The label of the column in df containing spiketrain identifiers (which spiketrain was responsible for the spike)

  • t_start – Time point at which to start firing rate estimates. Defaults to time of first spike in df.

  • t_stop – Time point of maximum firing rate estimate. Defaults to last spike in df.

Returns

A pandas DataFrame containing one row per spiketrain as well as its firing rate estimate.

Correlations

spiketimes.df.correlate.auto_corr(df: pandas.core.frame.DataFrame, binsize: int = 0.01, num_lags: int = 100, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain', t_start: float = None, t_stop: float = None)[source]

Calculate the autocorrelation function for each spiketrain in a DataFrame.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • binsize – The size of the time bin in seconds

  • num_lags – The number of lags forward and backwards around lag 0 to return

  • spiketimes_col – The label of the column containing spiketimes

  • spiketrain_col – The label of the column containing spiketrain identifiers

  • t_start – Minimum timepoint

  • t_stop – Maximum timepoint

Returns

A pandas DataFrame with columns {spiketrain, time_bin, autocorrelation}

spiketimes.df.correlate.cross_corr(df: pandas.core.frame.DataFrame, binsize: float = 0.01, num_lags: int = 100, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain', t_start: float = None, t_stop: float = None, use_multiprocessing: bool = False, max_cores: int = None)[source]

Calculate crosscorrelation between each combination of spiketrains in a DataFrame.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • binsize – The size of the time bin in seconds

  • num_lags – The number of lags forward and backwards around lag 0 to return

  • spiketimes_col – The label of the column containing spiketimes

  • spiketrain_col – The label of the column containing spiketrain identifiers

  • t_start – Minimum timepoint

  • t_stop – Maximum timepoint

  • use_multiprocessing – Whether to use multiple cores to compute cross correlation. Useful for large numbers of spiketrains

  • max_cores – If using multiprocessing, specifies the maximum number of cores to use. Defaults to max available

Returns

A pandas DataFrame with columns {spiketrain_1, spiketrain_2, time_bin, crosscorrelation}
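Cross-correlation between two spiketrains is usually computed on binned spike counts by sliding one count vector against the other. The sketch below returns raw, unnormalised products for lags from -num_lags to +num_lags; the library's normalisation and its conversion from lag to time_bin are not specified above, so this illustrates the idea only, and the name cross_correlogram is hypothetical.

```python
def cross_correlogram(counts_a, counts_b, num_lags):
    """Raw cross-correlation of two equal-length binned count vectors."""
    n = len(counts_a)
    result = []
    for lag in range(-num_lags, num_lags + 1):
        total = 0
        for i in range(n):
            j = i + lag  # positive lag: counts_b is shifted later than counts_a
            if 0 <= j < n:
                total += counts_a[i] * counts_b[j]
        result.append((lag, total))
    return result


# The second train fires exactly one bin after the first.
ccg = cross_correlogram([1, 0, 0, 0], [0, 1, 0, 0], num_lags=2)
```

Passing the same vector twice gives the autocorrelation, so the same sketch covers auto_corr.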

spiketimes.df.correlate.cross_corr_between_test(df: pandas.core.frame.DataFrame, binsize: float = 0.01, num_lags: int = 100, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain', group_col: str = 'group', t_start: float = None, t_stop: float = None, tail: str = 'two_tailed', adjust_p: bool = True, p_adjust_method: str = 'Benjamini-Hochberg', use_multiprocessing: bool = False, max_cores: int = None)[source]

Calculate crosscorrelation between all pairs of spiketrains of different groups and test its significance.

For example: correlate all pairs of fast-spiking and slow-spiking neurons.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • binsize – The size of the time bin in seconds

  • num_lags – The number of lags forward and backwards around lag 0 to return

  • spiketimes_col – The label of the column containing spiketimes

  • spiketrain_col – The label of the column containing spiketrain identifiers

  • t_start – Minimum timepoint

  • t_stop – Maximum timepoint

  • tail – Tail for hypothesis test {“two_tailed”, “upper”, “lower”}. Two-tailed recommended

  • adjust_p – Whether to adjust p-values for multiple comparisons.

  • p_adjust_method – If adjusting p-values, specifies which method to use {‘Benjamini-Hochberg’, ‘Bonferroni’, ‘Bonferroni-Holm’}

  • use_multiprocessing – Whether to use multiple cores to compute cross correlation. Useful for large numbers of spiketrains

  • max_cores – If using multiprocessing, specifies the maximum number of cores to use. Defaults to max available

Returns

A pandas DataFrame with columns {spiketrain_1, spiketrain_2, group_1, group_2, time_bin, crosscorrelation, p}
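The default p_adjust_method, Benjamini-Hochberg, controls the false discovery rate across the many pairwise tests. A minimal sketch of the standard step-up adjustment follows; the function name benjamini_hochberg is hypothetical and the code is not taken from the library's source.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Bonferroni, by comparison, multiplies every p-value by the number of tests, which is more conservative when many pairs are compared.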

spiketimes.df.correlate.cross_corr_test(df: pandas.core.frame.DataFrame, binsize: float = 0.01, num_lags: int = 100, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain', t_start: int = None, t_stop: int = None, tail: str = 'two_tailed', adjust_p: bool = True, p_adjust_method: str = 'Benjamini-Hochberg', use_multiprocessing: bool = False, max_cores: int = None)[source]

Calculate crosscorrelation between all pairs of spiketrains. Also test significance of crosscorrelation.

Significance test performed by comparing observed crosscorrelation to the expected crosscorrelation of Poisson spiketrains.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • binsize – The size of the time bin in seconds

  • num_lags – The number of lags forward and backwards around lag 0 to return

  • spiketimes_col – The label of the column containing spiketimes

  • spiketrain_col – The label of the column containing spiketrain identifiers

  • group_col – The label of the column containing group identifiers

  • t_start – Minimum timepoint

  • t_stop – Maximum timepoint

  • tail – Tail for hypothesis test {“two_tailed”, “upper”, “lower”}. Two-tailed recommended

  • adjust_p – Whether to adjust p-values for multiple comparisons.

  • p_adjust_method – If adjusting p-values, specifies which method to use {‘Benjamini-Hochberg’, ‘Bonferroni’, ‘Bonferroni-Holm’}

  • use_multiprocessing – Whether to use multiple cores to compute cross correlation. Useful for large numbers of spiketrains

  • max_cores – If using multiprocessing, specifies the maximum number of cores to use. Defaults to max available

Returns

A pandas DataFrame with columns {spiketrain_1, spiketrain_2, group_1, group_2, time_bin, crosscorrelation, p}

spiketimes.df.correlate.spike_count_correlation(df: pandas.core.frame.DataFrame, binsize: int, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain', min_firing_rate: float = None, t_start: float = None, t_stop: float = None, use_multiprocessing: bool = False, max_cores: int = None)[source]

Calculate Pearson’s correlation coefficient of spike counts between all pairs of spiketrains in a DataFrame.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • binsize – The size of the time bin in seconds

  • spiketimes_col – The label of the column containing spiketimes

  • spiketrain_col – The label of the column containing spiketrain identifiers

  • min_firing_rate – If specified, selects only bins where the geometric mean firing rate of the two spiketrains exceeds this value

  • t_start – start point for first time bin.

  • t_stop – end point for the last time bin.

  • use_multiprocessing – Whether to use multiple cores to compute cross correlation. Useful for large numbers of spiketrains

  • max_cores – If using multiprocessing, specifies the maximum number of cores to use. Defaults to max available

Returns

A pandas DataFrame with columns {spiketrain_1, spiketrain_2, R_spike_count}
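Spike count correlation is Pearson's r computed on the binned spike counts of two spiketrains over the same time bins. The sketch below computes r for one pair from already binned counts; the name pearson_r is hypothetical, and the binning step and the min_firing_rate filter are left out.

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


counts_a = [3, 1, 0, 2, 4]
counts_b = [6, 2, 0, 4, 8]   # always twice counts_a
r_same = pearson_r(counts_a, counts_b)
r_opposite = pearson_r([0, 1, 2], [2, 1, 0])
```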

spiketimes.df.correlate.spike_count_correlation_between(df: pandas.core.frame.DataFrame, binsize: int, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain', group_col: str = 'group', min_firing_rate: float = None, t_start: float = None, t_stop: float = None, use_multiprocessing: bool = False, max_cores: int = None)[source]

Calculate spike count correlation between all pairs of spiketrains of different groups.

For example: correlate all pairs of fast-spiking and slow-spiking neurons.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • binsize – The size of the time bin in seconds

  • spiketimes_col – The label of the column containing spiketimes

  • spiketrain_col – The label of the column containing spiketrain identifiers

  • group_col – The label of the column containing group identifiers

  • min_firing_rate – If specified, selects only bins where the geometric mean firing rate of the two spiketrains exceeds this value

  • t_start – start point for first time bin.

  • t_stop – end point for the last time bin.

  • use_multiprocessing – Whether to use multiple cores to compute cross correlation. Useful for large numbers of spiketrains

  • max_cores – If using multiprocessing, specifies the maximum number of cores to use. Defaults to max available

Returns

A pandas DataFrame with columns {spiketrain_1, spiketrain_2, R_spike_count}

spiketimes.df.correlate.spike_count_correlation_between_test(df: pandas.core.frame.DataFrame, binsize: int, n_boot: int = 500, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain', group_col: str = 'group', min_firing_rate: float = None, t_start: float = None, t_stop: float = None, tail: str = 'two_tailed', adjust_p: bool = True, p_adjust_method: str = 'Benjamini-Hochberg', use_multiprocessing: bool = False, max_cores: int = None)[source]

Calculate spike count correlation between all pairs of spiketrains of different groups. Also test significance using a bootstrap procedure.

For example: correlate all pairs of fast-spiking and slow-spiking neurons.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • binsize – The size of the time bin in seconds

  • n_boot – The number of bootstrap replicates to create.

  • spiketimes_col – The label of the column containing spiketimes

  • spiketrain_col – The label of the column containing spiketrain identifiers

  • group_col – The label of the column containing group identifiers

  • min_firing_rate – If specified, selects only bins where the geometric mean firing rate of the two spiketrains exceeds this value

  • t_start – The start point for the first time bin.

  • t_stop – The end point for the last time bin.

  • tail – Tail for hypothesis test {“two_tailed”, “upper”, “lower”}. Two-tailed is recommended.

  • adjust_p – Whether to adjust p-values for multiple comparisons.

  • p_adjust_method – If adjusting p-values, specifies which method to use {‘Benjamini-Hochberg’, ‘Bonferroni’, ‘Bonferroni-Holm’}

  • use_multiprocessing – Whether to use multiple cores to compute cross correlation. Useful for large numbers of spiketrains

  • max_cores – If using multiprocessing, specifies the maximum number of cores to use. Defaults to max available

Returns

A pandas DataFrame with columns {spiketrain_1, spiketrain_2, R_spike_count}
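
The resampling procedure itself is not described in this reference. As a hedged sketch of how a resampling-based significance test of a correlation can work, the code below uses a shuffle (permutation) null rather than a bootstrap proper: one vector of bin counts is shuffled n_boot times and the observed correlation is compared with the shuffled ones. The function name `shuffle_test` and the `seed` argument are hypothetical, and the library's own procedure may differ.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences (nan if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def shuffle_test(counts_1, counts_2, n_boot=500, tail="two_tailed", seed=0):
    """Return (observed r, p-value) using a shuffled-bins null distribution."""
    rng = random.Random(seed)
    observed = pearson_r(counts_1, counts_2)
    shuffled = list(counts_2)
    null = []
    for _ in range(n_boot):
        rng.shuffle(shuffled)  # break the bin-by-bin pairing
        null.append(pearson_r(counts_1, shuffled))
    if tail == "upper":
        extreme = sum(r >= observed for r in null)
    elif tail == "lower":
        extreme = sum(r <= observed for r in null)
    else:  # "two_tailed"
        extreme = sum(abs(r) >= abs(observed) for r in null)
    # Add-one correction so the p-value is never exactly zero.
    p_value = (extreme + 1) / (n_boot + 1)
    return observed, p_value


counts_a = list(range(20))
counts_b = [2 * c + 1 for c in counts_a]
observed, p_value = shuffle_test(counts_a, counts_b, n_boot=500)
```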

spiketimes.df.correlate.spike_count_correlation_test(df: pandas.core.frame.DataFrame, binsize: int, n_boot: int = 500, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain', min_firing_rate: float = None, t_start: float = None, t_stop: float = None, tail: str = 'two_tailed', adjust_p: bool = True, p_adjust_method: str = 'Benjamini-Hochberg', use_multiprocessing: bool = False, max_cores: int = None)[source]

Calculate spike count correlation between all pairs of spiketrains. Also test significance using a bootstrap procedure.

For example: correlate all pairs of fast-spiking and slow-spiking neurons. Multiprocessing is recommended when computing on large datasets.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • binsize – The size of the time bin in seconds

  • n_boot – The number of bootstrap replicates to create.

  • spiketimes_col – The label of the column containing spiketimes

  • spiketrain_col – The label of the column containing spiketrain identifiers

  • min_firing_rate – If specified, selects only bins where the geometric mean firing rate of the two spiketrains exceeds this value

  • t_start – The start point for the first time bin.

  • t_stop – The end point for the last time bin.

  • tail – Tail for hypothesis test {“two_tailed”, “upper”, “lower”}. Two-tailed is recommended.

  • adjust_p – Whether to adjust p-values for multiple comparisons.

  • p_adjust_method – If adjusting p-values, specifies which method to use {‘Benjamini-Hochberg’, ‘Bonferroni’, ‘Bonferroni-Holm’}

  • use_multiprocessing – Whether to use multiple cores to compute cross correlation. Useful for large numbers of spiketrains

  • max_cores – If using multiprocessing, specifies the maximum number of cores to use. Defaults to max available

Returns

A pandas DataFrame with columns {spiketrain_1, spiketrain_2, R_spike_count}
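
Because every pair of spiketrains yields its own p-value, the number of comparisons grows quickly, which is why adjust_p matters. The sketch below shows the standard Benjamini-Hochberg step-up adjustment in plain Python. The function name `benjamini_hochberg` is hypothetical, and how the library implements its ‘Benjamini-Hochberg’ option is not shown in this reference.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


raw = [0.01, 0.04, 0.03, 0.5]
adjusted = benjamini_hochberg(raw)
```

An adjusted p-value is never smaller than the raw one, so pairs that were borderline before adjustment may no longer be significant afterwards.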

Population

spiketimes.df.population.population_coupling_df(df: pandas.core.frame.DataFrame, spiketrain_col: str = 'spiketrain', spiketimes_col: str = 'spiketimes', binsize: float = 0.01, num_lags: int = 100, t_start: float = None, t_stop: float = None, return_all: bool = False)[source]

Calculate the population-coupling index between each spiketrain and all others in a DataFrame.

The metric is calculated by computing and standardising the cross correlation between an individual spiketrain and the “population spiketrain”, which consists of the spikes of all other neurons. A large z-scored cross correlation at lag 0 indicates high population coupling.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • spiketrain_col – The column containing spiketrain identifiers

  • spiketimes_col – The column containing spiketimes

  • binsize – The size of the time bin in seconds

  • num_lags – The number of lags forward and backwards around lag 0 to return

  • t_start – Minimum timepoint

  • t_stop – Maximum timepoint

  • return_all – If true, all time bins and cross correlation values are returned

Returns

A pandas DataFrame containing one row per spiketrain with columns {spiketrain_col, ‘population_coupling’}
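
The description above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's code: spiketrains are assumed to be already binned into equal-length count vectors, the cross correlation is the raw lagged product sum, and standardisation is a z-score across the returned lags. The names `cross_correlation` and `population_coupling` are hypothetical.

```python
import statistics


def cross_correlation(x, y, num_lags):
    """Raw cross correlation sum_t x[t] * y[t + lag] for lags -num_lags..num_lags."""
    n = len(x)
    result = {}
    for lag in range(-num_lags, num_lags + 1):
        total = 0
        for t in range(n):
            if 0 <= t + lag < n:
                total += x[t] * y[t + lag]
        result[lag] = total
    return result


def population_coupling(counts_by_train, target, num_lags):
    """Z-scored cross correlation at lag 0 between `target` and all other trains."""
    n = len(counts_by_train[target])
    # The "population spiketrain": summed counts of every other spiketrain.
    population = [
        sum(c[t] for name, c in counts_by_train.items() if name != target)
        for t in range(n)
    ]
    cc = cross_correlation(counts_by_train[target], population, num_lags)
    values = list(cc.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return float("nan")
    return (cc[0] - mean) / sd


counts = {
    "a": [1, 0, 0] * 4,
    "b": [1, 0, 0] * 4,
    "c": [1, 0, 0] * 4,
}
coupling_a = population_coupling(counts, "a", num_lags=2)
```

Here all three trains fire in the same bins, so the cross correlation peaks sharply at lag 0 and the index for "a" is large and positive.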

Baseline

spiketimes.df.baseline.zscore_standardise_by(df: pandas.core.frame.DataFrame, baseline_start_stop: numpy.ndarray, spiketrain_col: str = 'spiketrain', time_col: str = 'time', data_col: str = 'spike_count', returned_colname: str = 'zscore')[source]

For each spiketrain, convert a data column to zscores using only data from the baseline period.

Parameters
  • df – A pandas DataFrame containing multiple data points per spiketrain

  • baseline_start_stop – A numpy array containing the starting and ending time of the baseline period.

  • spiketrain_col – The column containing spiketrain identifiers

  • time_col – The column containing time points

  • data_col – The column containing data to be zscore standardised

  • returned_colname – The label of the column in the returned DataFrame containing zscores

Returns

A copy of the passed DataFrame with an additional column containing zscores
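
To make the baseline logic concrete, the sketch below standardises each spiketrain's values using the mean and standard deviation of its baseline rows only. It works on a list of dicts whose keys mirror the default column names. The function name `zscore_by_baseline`, the half-open baseline interval and the use of the population standard deviation are assumptions made for this illustration.

```python
import statistics
from collections import defaultdict


def zscore_by_baseline(rows, baseline_start, baseline_stop):
    """Return copies of `rows` with a 'zscore' key computed per spiketrain."""
    baseline_values = defaultdict(list)
    for row in rows:
        # Assumption: baseline is the half-open interval [start, stop).
        if baseline_start <= row["time"] < baseline_stop:
            baseline_values[row["spiketrain"]].append(row["spike_count"])
    stats = {
        name: (statistics.mean(vals), statistics.pstdev(vals))
        for name, vals in baseline_values.items()
    }
    out = []
    for row in rows:
        mean, sd = stats[row["spiketrain"]]
        new_row = dict(row)  # leave the input rows untouched
        new_row["zscore"] = (row["spike_count"] - mean) / sd
        out.append(new_row)
    return out


counts = [2, 4, 2, 4, 10, 10]
rows = [
    {"spiketrain": "n1", "time": t, "spike_count": c}
    for t, c in enumerate(counts)
]
standardised = zscore_by_baseline(rows, baseline_start=0, baseline_stop=4)
```

Rows outside the baseline period are still standardised, but they do not contribute to the mean or standard deviation.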

Apply

spiketimes.df.apply.apply_by(df: pandas.core.frame.DataFrame, func, func_kwargs: dict = None, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain', returned_colname: str = 'apply_result')[source]

Apply an arbitrary function to each spiketrain in a DataFrame.

The passed function should have a single return value for each spiketrain.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • func – The function to apply to the data

  • func_kwargs – A dictionary of keyword arguments to be passed to the function

  • spiketimes_col – The label of the column containing spiketimes

  • spiketrain_col – The label of the column containing spiketrain identifiers

  • returned_colname – The label of the column in the returned DataFrame containing the function result

Returns

A pandas DataFrame with columns {spiketrain_col, returned_colname}
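
The group-then-apply pattern can be sketched without pandas as below. Rows are (spiketrain, spiketime) pairs in tidy form, and the passed function returns one value per spiketrain. The names `apply_by_sketch` and `mean_firing_rate` are hypothetical and only illustrate the kind of function that could be passed as func.

```python
from collections import defaultdict


def apply_by_sketch(rows, func, func_kwargs=None):
    """Group spiketimes by spiketrain and apply `func` once per group."""
    func_kwargs = func_kwargs or {}
    grouped = defaultdict(list)
    for spiketrain, spiketime in rows:
        grouped[spiketrain].append(spiketime)
    return {name: func(times, **func_kwargs) for name, times in grouped.items()}


def mean_firing_rate(spiketimes, duration):
    """Example function: spikes per second over a fixed recording duration."""
    return len(spiketimes) / duration


rows = [("a", 0.1), ("a", 0.5), ("b", 0.2), ("a", 0.9)]
rates = apply_by_sketch(rows, mean_firing_rate, {"duration": 2.0})
```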

spiketimes.df.apply.apply_by_rolling(df: pandas.core.frame.DataFrame, func, num_periods: int = 10, func_kwargs: dict = None, spiketimes_col: str = 'spiketimes', spiketrain_col: str = 'spiketrain', returned_colname: str = 'rolling_result', copy: bool = True)[source]

Apply a function in a rolling window along each spiketrain in a DataFrame.

Parameters
  • df – A pandas DataFrame containing spiketimes indexed by spiketrain

  • func – The function to apply along the DataFrame

  • num_periods – The number of rows in the rolling window

  • spiketimes_col – The label of the column containing spiketimes

  • spiketrain_col – The label of the column containing spiketrain identifiers

  • returned_colname – The label of the column in the returned DataFrame containing the function result

  • copy – Whether to make a copy of the passed DataFrame before applying the function

Returns

A copy of the passed DataFrame with returned_colname appended
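
A rolling window over rows can be sketched in plain Python as below. The sketch assumes a trailing window that yields no value until num_periods rows are available, which mirrors the default behaviour of pandas rolling windows; whether the library configures its windows this way is not stated in this reference. The name `rolling_apply` is hypothetical.

```python
def rolling_apply(values, func, num_periods):
    """Apply `func` to each trailing window of `num_periods` consecutive values."""
    out = []
    for i in range(len(values)):
        if i + 1 < num_periods:
            # Not enough rows yet; pandas would report NaN here.
            out.append(None)
        else:
            out.append(func(values[i + 1 - num_periods : i + 1]))
    return out


spiketimes = [1, 2, 3, 4, 5]
rolling_sum = rolling_apply(spiketimes, sum, num_periods=3)
```

Applied per spiketrain, this produces one result per row, which is why the output keeps the shape of the input and gains a single extra column.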