Module & Function Descriptions

This document describes the main Python modules and functions used in the calcium imaging and electrical stimulation data pipeline.

`pipeline_script.py`

The main script for running the data processing pipeline. Configure paths and parameters here; the configurable values are explained in more detail within the script.

- RUN_MESC_PREPROCESS: calls mesc_tiff_extract.analyse_mesc_file()
- RUN_PREPROCESS: runs frequency_to_save.frequency_electrodeRoi_to_save(), mesc_data_handling.tiff_merge() and mesc_data_handling.extract_stim_frame()
- RUN_ANALYSIS_PREP: calls various analysis-related functions from functions.py
- PLOTS, PLOT_BTW_EXP, RUN_CELLREG_PREP, RUN_CELLREG, RUN_CELLREG_ANALYSIS: control plotting and CellReg-related processes

`general.py`

Contains utility functions for the TIFF extraction from MESc:
- ascii_to_str: converts arrays of ASCII codes to strings
- find_frame_index_from_timestamp: converts timing data to frame indices
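To illustrate what these two helpers do, here is a minimal sketch. It is not the code from general.py: the signatures, the dropping of NUL padding and the "last frame at or before the timestamp" rule are all assumptions made for the example.

```python
from bisect import bisect_right


def ascii_to_str(codes):
    """Turn a sequence of ASCII codes into a string.

    Assumption: zero-valued codes are padding and are dropped.
    """
    return "".join(chr(int(c)) for c in codes if int(c) != 0)


def find_frame_index_from_timestamp(timestamp_ms, frame_times_ms):
    """Return the index of the last frame that started at or before timestamp_ms.

    Assumption: frame_times_ms is sorted in ascending order.
    Timestamps earlier than the first frame map to frame 0.
    """
    idx = bisect_right(frame_times_ms, timestamp_ms) - 1
    return max(idx, 0)


# Hypothetical example values
name = ascii_to_str([77, 85, 110, 105, 116, 0, 0])
frame = find_frame_index_from_timestamp(70.0, [0.0, 32.3, 64.6, 96.9])
print(name, frame)
```

A binary search keeps the lookup fast even for long recordings with many frames.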

`mesc_loader.py`

Parses .mesc file metadata using xmltodict to extract imaging parameters. Includes hardcoded test paths that should be customised.

`mesc_tiff_extract.py`

`analyse_mesc_file()`

Extracts TIFF images and saves mesc_data.npy, trigger.txt, fileId.txt and frameNo.txt. Uses the function extract_useful_xml_params() from mesc_loader.py. NB! Don’t forget to check the stim_trig_channel value and match it to the MESc recordings.

`frequency_to_save.py`

`frequency_electrodeRoi_to_save()`

Saves frequency and electrode ROI info into .npy files.
- Electrode ROI info: saved automatically as all zeros into ELECTRODE_ROIS.NPY.
- Frequency info: saved into FREQUENCIES.NPY. The command line asks how to fill the container, with 3 options: a single frequency for all, a repeating pattern, or manual entry.
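The three ways of filling the frequency container can be sketched as follows. This is only an illustration: the function names, the mode strings and the list-based containers are assumptions, and the real function asks for the values interactively and writes .npy files.

```python
def fill_frequencies(n_recordings, mode, values):
    """Build one frequency entry per recording (hypothetical helper).

    mode "single":  values is one number used for every recording
    mode "pattern": values is a short list repeated until n_recordings is reached
    mode "manual":  values is the full list, one entry per recording
    """
    if mode == "single":
        return [values] * n_recordings
    if mode == "pattern":
        return [values[i % len(values)] for i in range(n_recordings)]
    if mode == "manual":
        if len(values) != n_recordings:
            raise ValueError("manual mode needs exactly one value per recording")
        return list(values)
    raise ValueError(f"unknown mode: {mode!r}")


def default_electrode_rois(n_recordings):
    """Electrode ROI container filled with zeros, one entry per recording."""
    return [0] * n_recordings


# Hypothetical example: five recordings alternating between two frequencies
print(fill_frequencies(5, "pattern", [50, 100]))
print(default_electrode_rois(5))
```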

`mesc_data_handling.py`

`tiff_merge()`

Merges multiple TIFFs based on experimental parameters. Saves merged TIFFs and associated frequency and ROI info.

It creates the merged_tiffs folder, receives the numbers given in pipeline_script.py and merges (concatenates) the corresponding TIFFs. Each merged TIFF is saved to its own folder. They are named as follows:
- Folder name: merged_experimentname_MUnit_number1_number2
- TIFF file name: merged_experimentname_MUnit_number1_number2.tif

If stimulation is True, it saves SELECTED_FREQS.NPY, which uses FREQUENCIES (from frequency_to_save.py) to store the corresponding frequencies.
It also saves SELECTED_ELEC_ROI.NPY, which is always the first ROI in the list, i.e. the ROI with ID 0. This part of the code was written early on and is not often used.
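The naming scheme above can be reproduced with a small helper, which is handy when locating merged files from other scripts. The helper name, the example experiment name and the assumption that each folder sits directly inside merged_tiffs are illustrative, not taken from mesc_data_handling.py.

```python
from pathlib import PurePosixPath


def merged_name(experiment_name, munit_numbers):
    """Build the shared folder/file stem for a set of merged MUnits."""
    suffix = "_".join(str(n) for n in munit_numbers)
    return f"merged_{experiment_name}_MUnit_{suffix}"


def merged_tiff_path(root, experiment_name, munit_numbers):
    """Path of the merged TIFF, assuming <root>/merged_tiffs/<stem>/<stem>.tif."""
    stem = merged_name(experiment_name, munit_numbers)
    return PurePosixPath(root) / "merged_tiffs" / stem / f"{stem}.tif"


# Hypothetical example: MUnits 3 and 4 of an experiment called "exp01"
print(merged_tiff_path("/data", "exp01", [3, 4]))
```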

`extract_stim_frame()`

Saves FRAMENUM.NPY and STIMTIMES.NPY which are the frame numbers and the stimulation timepoints for the merged tiff files.

`suite2p_script.py`

This script is based on the notebooks on the Suite2p GitHub page. It defines the parameters for the GCaMP6f and GCaMP6s indicators. Additional explanations and descriptions of the functions’ parameter lists are at the beginning of each script.

`run_suite2p()`

This function runs Suite2p on the merged TIFF files, starting from the base parameters defined in the Suite2p documentation. You can adjust these parameters in the if statements (one for gcamp == f, one for gcamp == s). Most of the time we modify threshold_scaling and spatial_scale. tau is already set according to the indicator, but you can modify it as well if really needed. The parameters are saved in suite2p_params.txt.
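The indicator-specific branching can be pictured as a set of overrides applied on top of the base parameters. The sketch below does not import Suite2p and is not the code from suite2p_script.py: the helper names are hypothetical, the tau values follow the ranges suggested in the Suite2p documentation (about 0.7 for GCaMP6f, 1.25 to 1.5 for GCaMP6s), and the threshold_scaling and spatial_scale values are placeholders to be tuned per dataset.

```python
def indicator_overrides(gcamp):
    """Return example parameter overrides for a GCaMP6 variant ('f' or 's')."""
    if gcamp == "f":
        return {"tau": 0.7, "threshold_scaling": 1.0, "spatial_scale": 0}
    if gcamp == "s":
        return {"tau": 1.25, "threshold_scaling": 1.0, "spatial_scale": 0}
    raise ValueError("gcamp must be 'f' or 's'")


def build_ops(base_ops, gcamp):
    """Copy the base parameters and apply the indicator-specific overrides."""
    ops = dict(base_ops)
    ops.update(indicator_overrides(gcamp))
    return ops


def format_params(ops):
    """Render the parameters as 'key: value' lines, e.g. for a text log."""
    return "\n".join(f"{key}: {ops[key]}" for key in sorted(ops))


# Hypothetical base parameters; in the real script these come from Suite2p
base = {"tau": 1.0, "fs": 30.0}
print(format_params(build_ops(base, "f")))
```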

`functions.py`

This script contains various functions for data analysis and visualization. Each function is modular and can be called independently.

`stim_dur_val()`

Calculates and saves the stimulation duration for each merged TIFF file based on the frequencies stored in selected_freqs.npy.
- Loads selected_freqs.npy from the corresponding folder.
- Saves the result to stimDurations.npy in the same directory.

`electrodeROI_val()`

Saves the selected electrode ROI number into a file called electrodeROI.npy.
- Needs selected_elec_roi as input, which is the integer ROI number designated as the electrode.

`dist_vals()`

Calculates and saves distances between ROIs and the electrode ROI.

Extracts the ROI centroid coordinates (med) of all detected cells. Calculates the Euclidean distance between each ROI and a predefined electrode ROI. Saves results in both .npy and .csv formats.
Output is:
- ROI_numbers.npy: Contains the Suite2p ROI indices of all detected cells
- distances.npy: Euclidean distances from each ROI to the electrode ROI
- elec_roi_info.csv: Summary CSV file with ROI indices, centroid positions, electrode med position, and distances

Note: Suite2p saves med in (y, x) order, not (x, y).

ROI_numbers.npy is needed for further calculations in timecourse_val().
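The distance calculation itself is short. The sketch below uses plain lists of (y, x) centroids in place of the Suite2p stat.npy contents; the function name and the example coordinates are hypothetical.

```python
import math


def distances_to_electrode(meds, electrode_roi):
    """Euclidean distance (in pixels) from every ROI centroid to the electrode ROI.

    meds: list of (y, x) centroids, in the order Suite2p stores them.
    electrode_roi: index of the ROI designated as the electrode.
    """
    ey, ex = meds[electrode_roi]
    return [math.hypot(y - ey, x - ex) for (y, x) in meds]


# Hypothetical centroids; ROI 0 is the electrode
meds = [(10.0, 20.0), (13.0, 24.0), (10.0, 50.0)]
print(distances_to_electrode(meds, 0))
```

Because the distance is symmetric in y and x, the (y, x) ordering only matters once the coordinates are plotted or written to the CSV.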

`spontaneous_baseline_val()`

This function is for spontaneous recordings (e.g. chronic recordings without stimulation). It calculates the baseline-corrected traces using a predefined, fixed frame window, then plots and saves the calcium traces for a selected list of ROIs.
- Output: .svg plots for each ROI in list_of_roi_nums, and the baseline-corrected trace in F0.npy.

`baseline_val()`

Calculates the F0 baseline before stimulation onset using stimTimes.npy. The baseline is defined as the period before the first stimulation time.
- Needs stimTimes.npy, which is saved by extract_stim_frame() in mesc_data_handling.py.
- Output: F0.npy, saved into the suite2p/ directory.
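As a rough illustration of a pre-stimulation baseline, the sketch below averages the trace up to the first stimulation and then normalises the trace to it. Two things are assumptions here and may not match functions.py: that the first stimulation is given as a frame index, and that the correction is the usual (F - F0) / F0.

```python
from statistics import fmean


def baseline_f0(trace, first_stim_frame):
    """Mean fluorescence over all frames before the first stimulation."""
    if first_stim_frame <= 0:
        raise ValueError("no frames available before the first stimulation")
    return fmean(trace[:first_stim_frame])


def delta_f_over_f(trace, f0):
    """Baseline-corrected trace, assuming the (F - F0) / F0 convention."""
    return [(f - f0) / f0 for f in trace]


# Hypothetical trace: three baseline frames, stimulation starts at frame 3
trace = [100.0, 100.0, 100.0, 150.0, 200.0]
f0 = baseline_f0(trace, 3)
print(f0, delta_f_over_f(trace, f0))
```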

`activated_neurons_val()`

This function identifies activated neurons by applying a statistical threshold to the fluorescence signal. It generates a list of activated ROIs and saves them in a CellReg-compatible .mat file. It also saves the activated_neurons.npy file, which contains ROI_numbers, thresholds and activated_neurons.

Detects and saves activated neurons across stimulation blocks based on baseline corrected fluorescence and a statistical threshold (average of baseline period + standard deviation). Also creates a CellReg-compatible .mat file of activated ROI masks.
This function is usually a good fit for experiments with multiple stimulation repeats and trials. Output is:

- activated_neurons.npy: Contains a DataFrame with ROI numbers, calculated thresholds, and binary activation results for each ROI per stimulation repeat.
- *block*_cellreg_input.mat: A CellReg-compatible mask stack of activated ROIs.

`timecourse_val()`

Analyzes per-trial traces and stimulation effects across time.
Analyzes calcium signal responses during stimulation and resting phases across multiple blocks and trials.

Output is:

results.npz: A compressed NumPy file containing:
- stimResults: Binary array [ROI, block, trial]; 1 if above threshold during stimulation
- restResults: Binary array; 1 if above threshold during resting phase
- stimAvgs: Mean ΔF/F during stimulation
- restAvgs: Mean ΔF/F during rest
- baselineAvgs: Baseline values per block per ROI
- full_trial_traces: Concatenated trial traces (stim + rest)

`data_analysis_values()`

Generates multiple summary plots, e.g., active cell count, avg amplitude, etc.

Generates multiple .svg figures with subplots summarising activation results.

The figures cover:
- Number and fraction of active neurons per block
- Average calcium amplitudes across all ROIs for each block, and across all blocks for each trial
- Trial-wise traces
- Distance from electrode vs. response

NB! Don’t forget to update the legend values: the figures cannot be plotted if the number of legend values does not match the number of files.

`plot_stim_traces()`

Analyses the calcium responses of a specific ROI and a population of neurons to electrical stimulation across trials. It extracts stimulus-locked traces, identifies activated ROIs, computes average responses, and generates visualisations and data files for further analysis or cell registration.

Calculations:

Trace extraction:

For roi_idx, extracts a 4-second time window from the calcium traces (1 s before to 3 s after each stimulation) across all stimulations and repeats.
Stores in all_traces[repeat, stim_idx].

ROI coordinate analysis:

For each stimulation, computes the average (x, y) position of activated ROIs and saves a .csv with the average ROI coordinates per stimulation and repeat. The CoM script, written later, makes this part of the code redundant.

Distance from artificial origin:
Computes distances of all ROIs from an artificial origin (center of FOV).

Saves pixel distances in dist_from_o_pix.txt.

Activation detection:

For every valid ROI (iscell==1):

Computes the average signal during each stimulation window.
Compares to a threshold: mean(baseline) + threshold_value × std(baseline).
Saves binary activation (0/1) in activation_results[roi_id][repeat][stim_idx].

The activation_results dictionary contains a binary value for each ROI: 1 if the cell was activated and 0 if it wasn’t.
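The trace extraction and activation detection steps can be sketched for a single ROI and a single stimulation as follows. The function names, the fps and stim_dur_s arguments, and the choice of the pre-stimulation second as the baseline period are assumptions for the example; pstdev is used because it matches the default behaviour of NumPy's std.

```python
from statistics import fmean, pstdev


def extract_window(trace, stim_frame, fps, pre_s=1.0, post_s=3.0):
    """Cut the stimulus-locked window: pre_s before to post_s after onset."""
    start = stim_frame - int(round(pre_s * fps))
    stop = stim_frame + int(round(post_s * fps))
    if start < 0 or stop > len(trace):
        raise ValueError("window extends beyond the recording")
    return trace[start:stop]


def is_activated(trace, stim_frame, fps, stim_dur_s, threshold_value, pre_s=1.0):
    """Return 1 if the mean signal during stimulation exceeds
    mean(baseline) + threshold_value * std(baseline), else 0."""
    n_pre = int(round(pre_s * fps))
    if stim_frame - n_pre < 0:
        raise ValueError("not enough frames before stimulation onset")
    baseline = trace[stim_frame - n_pre:stim_frame]
    stim = trace[stim_frame:stim_frame + int(round(stim_dur_s * fps))]
    threshold = fmean(baseline) + threshold_value * pstdev(baseline)
    return int(fmean(stim) > threshold)


# Hypothetical 5 s recording at 10 frames per second, stimulation at frame 10
responsive = [1.0, 1.2] * 5 + [2.0] * 10 + [1.1] * 30
quiet = [1.0, 1.2] * 25
print(len(extract_window(responsive, 10, 10)))
print(is_activated(responsive, 10, 10, 1.0, 3), is_activated(quiet, 10, 10, 1.0, 3))
```

Running the same check for every valid ROI, repeat and stimulation fills a nested structure like activation_results.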

Plot out activated neurons:

The stim activation counts are redundant as well. The .csv lists the ROIs that were activated per repeat.

CellReg-compatible output:

Constructs binary masks of activated ROIs and saves the masks into .mat files under cellreg_files/.

Activation summary:

Optionally, you can also save the activation DataFrame, which counts the number of activated ROIs per stimulation and repeat. This saves a .csv with lists and counts of activated ROIs (stim_activation_counts_fileX.csv).

Extracting traces per amplitude:

Creates the sum_avg_dir/ folder, which contains the traces of ROIs per amplitude in .npy files, a .csv containing the average coordinates of the ROIs per repeat and stimulation, and a .csv containing the activation results per amplitude.

Plots:

`roi_map_per_stim.svg`

Grid of stimulation footprints: one subplot per (repeat × stimulation), showing which ROIs were activated.

`stim_traces_grid.svg`

Calcium traces for roi_idx, organized by repeat and stim amplitude, with stimulation onset marked.

`overlapping_per_trial_for_roiX.png`

Overlapping traces of roi_idx across all stimulations per repeat. Color-coded by amplitude.

`overlapping_per_param_for_roiX.png`

Overlapping traces of roi_idx across all repeats, grouped by stimulation amplitude.

`sum_avg_traces_subplot.svg`

Grand-average trace of all activated ROIs for each amplitude. One plot per amplitude.

Saved files:

`.csv:`

activation_results_fileX.csv: binary activation per ROI.
avg_x_y_per_repeat_stim_fileX.csv: mean coordinates.
stim_activation_counts_fileX.csv: activation summary.
activation_counts.csv: total count of active ROIs per amplitude.

`.npy:`

Grand average trace of activated ROIs per amplitude (sum_avg_XXuA.npy).

`.mat:`

ROI masks for CellReg, saved per amplitude (pix_data_for_XXuA.mat, activated_mask_XXuA.mat)

`.svg / .png:`

all plots

`plot_across_experiments()`

Gets the data from the sum_avg_dir/ folder which is created in plot_stim_traces(). Plots overlaid average calcium traces across multiple stimulation experiments, grouped by stimulation amplitude. Loads previously saved average traces per stimulation amplitude (from sum_avg_<amplitude>uA.npy files) and overlays these traces in subplots.

`analyze_merged_activation_and_save()`

Analyses neuronal activity across multiple blocks and files, identifying activated neurons based on a statistical threshold applied to the fluorescence signal during stimulation periods. The analysis is performed block by block using metadata files and preprocessed Suite2p outputs. Unlike the previous functions, it lets you modify the stimulation parameter values to suit multiple protocols (e.g. it’s good for current steering experiments where there is 1 repeat of 10 trials). It also saves the results in a CellReg-compatible .mat file, which can be used for further analysis in CellReg.

`CoM.py`

`inverse_distance_weighted_center_of_mass()`

Calculates the center of mass relative to a predefined center (set to (0,0), which is the upper-left corner of the FOV).
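One plausible reading of an inverse-distance-weighted center of mass is sketched below: each ROI position is weighted by 1/d, where d is its distance from the reference point. The weighting formula, the eps guard and the function name are assumptions for illustration and may differ from what CoM.py actually computes.

```python
import math


def idw_center_of_mass(coords, reference=(0.0, 0.0), eps=1e-9):
    """Weighted mean of 2D points with weights 1 / (distance to reference).

    coords: iterable of coordinate pairs; the result uses the same axis order.
    eps avoids division by zero for a point lying exactly on the reference.
    """
    weight_sum = 0.0
    first_sum = 0.0
    second_sum = 0.0
    for a, b in coords:
        d = math.hypot(a - reference[0], b - reference[1])
        w = 1.0 / (d + eps)
        weight_sum += w
        first_sum += w * a
        second_sum += w * b
    if weight_sum == 0.0:
        raise ValueError("coords must contain at least one point")
    return (first_sum / weight_sum, second_sum / weight_sum)


# Hypothetical ROI positions at distances 5 and 10 from the origin
print(idw_center_of_mass([(3.0, 4.0), (6.0, 8.0)]))
```

With this weighting, ROIs close to the reference point pull the result towards themselves more strongly than distant ones.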

`plot_weighted_com()`

coords_list is an array holding the med values of the activated ROIs as np.arrays, one per block (TIFF). These values come from med_of_act_ns_[…].csv, which is saved by analyze_merged_activation_and_save() in functions.py, so you have to run that function first. (You can quickly convert med_of_act_ns_[…].csv to a list of arrays with the help of ChatGPT.)
In the plot_order list, specify the order of recordings based on the experiment records. This is important because the plot is created according to the order of the files. Also pay attention to the inverted axes: this is tricky because the y axis is inverted in Suite2p’s stat.npy output, so be sure to double-check the plot against the original recording.

`cellreg_process.py`

Information on how to run CellReg from the MATLAB GUI is available here.

`suite2p_to_cellreg_masks()`

Good for creating the .mat files from spontaneous recordings or single recordings (basically files which don’t require any kind of additional calculations to separate stimulations) Saves the result .mat files into the new cellreg_files/ directory.

`cellreg_analysis_overlap()`

After running, CellReg saves the cellRegistered[date].mat file, a struct containing information from the cell registration procedure. Don’t forget to update this file name, otherwise you’ll get false results. This function uses cell_to_index_map (“A matrix of size NxM, with the mapping of each registered cell to the indices in each registered session”) to calculate the overlap between sessions in a pairwise manner. The output is session_pair_overlap.csv, which contains the compared sessions A and B, the number of overlapping cells, and this number as a percentage of the whole.
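The pairwise overlap can be sketched directly on a cell_to_index_map-style matrix, where each row is a registered cell, each column a session, and 0 means the cell was not found in that session. The function name, the output keys and the choice of the total number of registered cells as the denominator are assumptions; the real function may define "the whole" differently.

```python
from itertools import combinations


def session_pair_overlap(cell_to_index_map):
    """Count, for every pair of sessions, the cells present in both."""
    n_cells = len(cell_to_index_map)
    n_sessions = len(cell_to_index_map[0]) if n_cells else 0
    rows = []
    for a, b in combinations(range(n_sessions), 2):
        shared = sum(1 for row in cell_to_index_map if row[a] > 0 and row[b] > 0)
        percent = 100.0 * shared / n_cells if n_cells else 0.0
        rows.append({
            "session_A": a + 1,  # 1-based session numbers
            "session_B": b + 1,
            "overlap": shared,
            "percent": percent,
        })
    return rows


# Hypothetical map: 4 registered cells across 3 sessions
example_map = [
    [1, 1, 0],
    [2, 2, 1],
    [0, 3, 2],
    [3, 0, 0],
]
for row in session_pair_overlap(example_map):
    print(row)
```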

`single_block_activation()`

Based on the given parameters, it calculates the activated cells for each stimulation set and saves the .mat files for them. This function was written before analyze_merged_activation_and_save() in functions.py, which has since made it redundant. It can still be useful if you only want to save the .mat files.

`cellreg_analysis.py`

`run_cellreg_matlab()`

This is where the magic happens. Once the preceding CellReg preparation steps are done, you only need to run this function to run the CellReg analysis.
1. Check in the script that the path to the MATLAB script is correct.
2. Make sure the data_path and the other values are correct; then you can run the MATLAB script from Python.
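One common way to launch a MATLAB script from Python is through the matlab command line with the -batch option (available in recent MATLAB releases). The sketch below only shows that general approach; the helper names, the MATLAB function name run_cellreg and the example paths are hypothetical, and cellreg_analysis.py may use a different mechanism.

```python
import subprocess


def build_matlab_command(matlab_script_dir, data_path, function_name="run_cellreg"):
    """Build the command line that adds the script folder to the MATLAB path
    and calls the (hypothetical) MATLAB function with the data path."""
    statement = f"addpath('{matlab_script_dir}'); {function_name}('{data_path}');"
    return ["matlab", "-batch", statement]


def run_matlab(matlab_script_dir, data_path):
    """Run MATLAB and raise CalledProcessError if it exits with an error."""
    command = build_matlab_command(matlab_script_dir, data_path)
    return subprocess.run(command, check=True)


# Only the command is printed here; call run_matlab() on a machine with MATLAB installed
print(build_matlab_command("/path/to/matlab_scripts", "/path/to/data"))
```

Printing the command first makes it easy to verify both paths before starting a long registration run.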

Note: Many functions save data in .npy, .csv, .svg and .mat formats as part of the pipeline's modular output. Please also read through the functions carefully: many containers can optionally be saved, but right now those lines are commented out.