This repository is for analyzing the public dataset https://allensdk.readthedocs.io/en/latest/visual_coding_neuropixels.html.
python >= 3.10 (legacy: 3.8.16)
Requires Anaconda (conda). Create and use a dedicated environment:
# create env with Python 3.10 and the Anaconda metapackage
conda create -n allen python=3.10 anaconda -y
# activate the environment
conda activate allen
# install project dependencies into the conda env
pip install -r requirements.txt
- Notebooks: Jupyter notebooks for analysis and visualization.
- Scripts: Scripts for batch processing with an automated pipeline.
- Tools: Test notebooks for developing tool functions.
- Toolkit: Function modules for this repo.
- Docs: Documentation for this repo.
- Legacy: Legacy code for this repo.
- path_config.json: Sets the paths for the cache data and output data.
- global_settings.json: Sets the global settings and parameters for the analysis.
- output_config.json: Sets the format of the output data.
- sessions.json: Lists the session IDs used for test runs of the batch processing scripts, and the blacklist of sessions to exclude from processing.
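As a rough illustration of how a script might read these JSON files, the sketch below loads a configuration file and looks up a directory entry. The helper names `load_config` and `resolve_dir` are hypothetical, and the repository's own loading code may differ; only the key names `output_base_dir` and `batch_log_dir` are taken from this README.

```python
import json
from pathlib import Path


def load_config(path):
    """Read a JSON configuration file and return its contents as a dict."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def resolve_dir(config, key):
    """Return the directory stored under `key` as a Path.

    Raises a KeyError with a readable message if the key is not set.
    """
    if key not in config:
        raise KeyError(f"'{key}' is not set in the configuration")
    return Path(config[key]).expanduser()
```

Under these assumptions, `resolve_dir(load_config("path_config.json"), "output_base_dir")` would return the output root as a `Path`.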
- See Requirements for how to create a conda environment with the necessary dependencies.
- Set the paths for the cache data and output data in path_config.json. We suggest using the shared directory on the server, `/home/shared/Allen_Visual_Coding_Data`, as the root directory.
- Make sure the conda environment is activated before running any script.
- Edit the batch script batch_run_script.sh to run the desired Python script in the `scripts/` folder. Set the argument `--session_set` to the desired set of sessions to process, and set other arguments if needed. Run `python scripts/[script_name].py -h` to see the available arguments for the script.
- Make sure the directory `./stdout/` exists under the root directory of the repository. This is the directory for the output logs of the batch processing scripts.
- Run the batch script with `sbatch batch_run_script.sh` to process the sessions.
  For large workloads, use submit_array_job.sh to split sessions across parallel tasks:
  - Edit the configuration variables at the top of the script (`NUM_TASKS`, `SCRIPT_PATH`, `SCRIPT_ARGS`).
  - Run `bash submit_array_job.sh`.
  The script automatically splits sessions, runs tasks in parallel, and combines logs after completion.
- After a script finishes running, check the `batch_logs` folder (see `batch_log_dir` in path_config.json) for the logs of printed messages and errors. The parameters used for the script are saved in `.json` files in the `batch_logs` folder.
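To make the idea of splitting sessions across array tasks concrete, here is a minimal sketch. It assumes a round-robin split and reads the standard SLURM variables `SLURM_ARRAY_TASK_ID` and `SLURM_ARRAY_TASK_COUNT`; the function names are hypothetical, and the actual strategy used by submit_array_job.sh and the scripts may differ.

```python
import os


def split_sessions(session_ids, array_index, array_total):
    """Return the sessions assigned to one array task (round-robin split).

    array_index is 0-based and must be smaller than array_total.
    """
    if not 0 <= array_index < array_total:
        raise ValueError("array_index must be in [0, array_total)")
    return session_ids[array_index::array_total]


def detect_array_task(default_index=0, default_total=1):
    """Read the task index and task count from SLURM's environment, if present.

    Falls back to a single task when the variables are not set.
    """
    index = int(os.environ.get("SLURM_ARRAY_TASK_ID", default_index))
    total = int(os.environ.get("SLURM_ARRAY_TASK_COUNT", default_total))
    return index, total
```

With five sessions and two tasks, for example, task 0 would receive the first, third, and fifth sessions and task 1 the remaining two, so every session is processed exactly once.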
- Common arguments for all scripts:
  - `--session_set`: The session set to process the sessions from. Available sets: `all`, `test`, `selected`, `unselected`, `optotag`.
    - `all`: All sessions in Allen's database.
    - `test`: Test sessions listed in the `'test'` key of sessions.json.
    - `selected`: Selected sessions recorded in the `session_selection.csv` file in the `output` folder (see `output_base_dir` in path_config.json).
    - `unselected`: Unselected sessions recorded in the `session_selection.csv` file in the `output` folder (see `output_base_dir` in path_config.json), excluding sessions with missing LFP data.
    - `optotag`: Optotag sessions. (Not implemented yet)
  - `--session_list`: List of session IDs to process (space-separated). The `--session_set` argument is ignored if this is provided.
  - `--use_blacklist`: Exclude the blacklisted sessions from processing, to avoid uncaught errors in some sessions that may cause the batch processing to stall. The blacklist is stored in the `'blacklist'` key of sessions.json.
  - `--disable_logging`: Disable logging to the log file.
  - `--array_index`: SLURM array task index (0-based). Auto-detected from the environment if not provided.
  - `--array_total`: Total number of array tasks. Auto-detected from the environment if not provided.
- combine_array_logs.py: Combine array job log files into a single file. Called automatically by the array job submission.
- test_batch_logging.py: Test script for the batch processing and logging system.
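The common arguments described above could be wired up roughly as follows. This is only a sketch: the argument names come from this README, but the defaults, the types, the use of `store_true` flags, and the helper `select_sessions` are assumptions, and the scripts may define them differently (run a script with `-h` for the authoritative list).

```python
import argparse


def build_parser():
    """Build a parser exposing the common arguments (illustrative defaults)."""
    parser = argparse.ArgumentParser(description="Batch-process sessions (sketch).")
    parser.add_argument("--session_set", default="test",
                        choices=["all", "test", "selected", "unselected", "optotag"])
    parser.add_argument("--session_list", type=int, nargs="+", default=None)
    parser.add_argument("--use_blacklist", action="store_true")
    parser.add_argument("--disable_logging", action="store_true")
    parser.add_argument("--array_index", type=int, default=None)
    parser.add_argument("--array_total", type=int, default=None)
    return parser


def select_sessions(args, session_sets, blacklist):
    """Pick the session IDs to process.

    An explicit --session_list takes precedence over --session_set.
    Blacklisted sessions are dropped only when --use_blacklist is given.
    """
    sessions = args.session_list if args.session_list else session_sets[args.session_set]
    if args.use_blacklist:
        blocked = set(blacklist)
        sessions = [s for s in sessions if s not in blocked]
    return list(sessions)
```

Here `session_sets` stands for a mapping from set name to session IDs (for example, built from sessions.json), which is also a hypothetical structure.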
-
- Initial processing: download and cache data from the Allen Database. If `--cache_data_only` is set to `True`, the script performs only this step and skips further processing.
- Find probe channels for the target structure (e.g. VISp).
- Compute CSD for the channels in the structure. If `--skip_compute_csd` is set to `True`, the script skips computing CSD.
Additional processing: After the batch process finishes, run the notebook check_channel_layer_positions to overwrite the probe info files and LFP channels files whose layer positions are missing. This avoids errors in further processing.
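For readers unfamiliar with CSD, one common estimator is the negative second spatial derivative of the LFP along the probe, approximated by a second difference across evenly spaced channels. The sketch below shows that estimator only; it is not necessarily the method used by the script, and the function name, the `spacing` argument, and the placeholder conductivity `sigma` are assumptions.

```python
import numpy as np


def second_derivative_csd(lfp, spacing, sigma=1.0):
    """Estimate CSD from LFP with the standard second spatial difference.

    lfp: array of shape (n_channels, n_samples), channels ordered by depth
         and evenly spaced.
    spacing: distance between adjacent channels.
    sigma: tissue conductivity (placeholder value; it only scales the output).
    Returns an array of shape (n_channels - 2, n_samples) covering the
    interior channels, since the edge channels lack a neighbor on one side.
    """
    lfp = np.asarray(lfp, dtype=float)
    if lfp.ndim != 2 or lfp.shape[0] < 3:
        raise ValueError("lfp must be 2-D with at least 3 channels")
    second_diff = lfp[2:] - 2.0 * lfp[1:-1] + lfp[:-2]
    return -sigma * second_diff / spacing ** 2
```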
-
- Calculate average PSD of stimuli.
- Calculate PSD averaged across each condition of drifting gratings stimuli.
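The two steps above can be sketched as a per-trial spectrum followed by averaging within each stimulus condition. The example uses a plain Hann-windowed periodogram in NumPy as a stand-in; the script may well use a different estimator (for example Welch's method), and the function names and array layout here are assumptions.

```python
import numpy as np


def trial_psd(trials, fs):
    """One-sided Hann-windowed periodogram of each trial.

    trials: array of shape (n_trials, n_samples); fs: sampling rate in Hz.
    Returns (freqs, psd) with psd of shape (n_trials, n_freqs).
    """
    trials = np.asarray(trials, dtype=float)
    n = trials.shape[-1]
    window = np.hanning(n)
    spectrum = np.fft.rfft(trials * window, axis=-1)
    psd = np.abs(spectrum) ** 2 / (fs * np.sum(window ** 2))
    # Double every bin except DC (and Nyquist when n is even) so the
    # one-sided spectrum also accounts for the negative frequencies.
    if n % 2 == 0:
        psd[..., 1:-1] *= 2.0
    else:
        psd[..., 1:] *= 2.0
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, psd


def average_psd_by_condition(trials, conditions, fs):
    """Average the trial PSDs within each condition label.

    Returns (freqs, {label: mean PSD over the trials with that label}).
    """
    freqs, psd = trial_psd(trials, fs)
    conditions = np.asarray(conditions)
    labels = sorted(set(conditions.tolist()))
    return freqs, {label: psd[conditions == label].mean(axis=0) for label in labels}
```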
-
- Fit FOOOF to the PSD of stimuli.
- Get frequency bands of waves specified in global_settings.json.
- Calculate band power in drifting grating conditions and PSD of stimuli with filtered conditions.
- Save figures. (Optional: set `save_figure` to `true` in output_config.json)
Additional processing: After finishing the batch process, run the notebook compile_psd.ipynb with `session_set = 'all'` to compile the PSD of stimuli and frequency bands of waves for all sessions.
-
- Get the trial-averaged CSD and the CSD power in the wave bands from the FOOOF results, for flashes and drifting gratings stimuli.
- Save figures. (Optional: set `save_figure` to `true` in output_config.json)
The notebooks are for interactive analysis and visualization. Some of them are part of the procedures in Scripts.
-
Initial processing before analyzing a session. Find the LFP channels in the selected cortical structure and get the central channels in each layer.
-
Check the layer positions of the LFP channels in the selected structure. Some sessions may have LFP channels with missing CCF coordinates. For these channels, the layer is guessed from the vertical position of the channel, using the average relative positions of the boundaries between layers estimated from the sessions that do have CCF coordinates. The notebook then overwrites the probe info file and LFP channels file with the guessed layers.
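The guessing step can be pictured as a lookup of each channel's normalized depth against a list of average boundary fractions. The sketch below is an illustration under assumptions: the function name, the depth convention (0 at the top of the structure, 1 at the bottom), and any boundary values passed in are hypothetical and are not taken from the notebook.

```python
from bisect import bisect_right


def guess_layers(depth_fractions, boundaries, layer_names):
    """Assign a layer to each channel from its normalized depth.

    depth_fractions: channel depths scaled to [0, 1] (0 = top of the
        structure, 1 = bottom).
    boundaries: increasing fractions where one layer ends and the next begins.
    layer_names: one more name than there are boundaries, ordered top to bottom.
    A channel lying exactly on a boundary is assigned to the deeper layer.
    """
    if len(layer_names) != len(boundaries) + 1:
        raise ValueError("need exactly one more layer name than boundaries")
    return [layer_names[bisect_right(boundaries, d)] for d in depth_fractions]
```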
-
Calculate PSD of stimuli of a session and apply FOOOF to fit the PSD. Get frequency bands of waves specified in global_settings.json. Calculate band power in drifting grating conditions and PSD of stimuli with filtered conditions.
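Once a frequency band is known, band power is commonly obtained by integrating the PSD over that band. A minimal sketch with the trapezoidal rule follows; the function name is hypothetical, and the notebook may compute band power differently (for instance, after removing the aperiodic component fitted by FOOOF).

```python
import numpy as np


def band_power(freqs, psd, band):
    """Integrate a PSD over the closed interval [band[0], band[1]].

    freqs: 1-D array of frequencies in increasing order.
    psd: array whose last axis matches freqs (extra leading axes are kept).
    Uses the trapezoidal rule on the frequency bins that fall inside the band.
    """
    freqs = np.asarray(freqs, dtype=float)
    psd = np.asarray(psd, dtype=float)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    f, p = freqs[mask], psd[..., mask]
    if f.size < 2:
        raise ValueError("band must cover at least two frequency bins")
    return np.sum(0.5 * (p[..., 1:] + p[..., :-1]) * np.diff(f), axis=-1)
```

Because `psd` may carry leading axes, the same call can return one band power per drifting grating condition when the PSDs are stacked along the first axis.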
-
Compile the PSD of stimuli and frequency bands of waves for all sessions.
- Compute the average PSD across sessions and get the frequency bands of waves.
- Compile frequency bands of waves for all sessions.
- Find sessions with good power in each wave band of interest.
- Find the frequency band of each wave in the layer of interest for a specific stimulus.
- Save the results of the above steps to data files.
- Save figures of average PSD across sessions. (Optional: set `save_figure` to `true` in output_config.json)
-
Analyze the trial-averaged CSD and the CSD power in the wave bands from the FOOOF results, for flashes and drifting gratings stimuli.
Helper notebooks:
-
Compress data files from the cache or output directory that match given file patterns into .zip files.
-
Display figures from the output figure directory that match specific file patterns.
1. Edit the configuration file to set the directories: `cache_dir` for the allensdk data cache, `output_dir` for result data, and `figure_dir` for result figures. Specify properties in `filter_dict` for filtering sessions in the dataset.
2. Run Choose_Visual_Data to retrieve the data of a target region in a session. Local field potential (LFP) data averaged over groups of channels, together with an information json file for the selected session and region, will be saved to the output folder.
3. Edit `analysis_object` in the configuration file to refer to the selected session and region. Set `enter_parameters` to `true` when analyzing the selected object for the first time.
4. Run Analyze_Visual_LFP to analyze the power spectrum of the LFP.
5. Run Analyze_Visual_Spike to analyze population firing activity and save the result data to the output folder.
6. Run Analyze_Visual_Entrainment to analyze the entrainment of units to the oscillation and the correlation between properties of units. Steps 4 and 5 need to be done before this step.
7. The selected parameters for the analyses are saved in the information json file. If `enter_parameters` is set to `false` and the analyses in steps 4 to 6 are run again, the saved parameters will be used. If `save_figure` is set to `true`, figures will be saved to the figure folder.
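The reuse of saved parameters could work along the following lines. This is a sketch under assumptions: the function name, the `"parameters"` key, and the layout of the information json file are hypothetical and are not taken from the legacy notebooks.

```python
import json
from pathlib import Path


def get_parameters(info_path, enter_parameters, new_parameters=None):
    """Return analysis parameters, reusing saved ones unless new ones are entered.

    When enter_parameters is true, new_parameters are stored in the
    information file so that later runs can read them back.
    When it is false, the previously saved parameters are returned.
    """
    info_path = Path(info_path)
    info = json.loads(info_path.read_text()) if info_path.exists() else {}
    if enter_parameters:
        # Store the newly chosen parameters for later runs.
        info["parameters"] = dict(new_parameters or {})
        info_path.write_text(json.dumps(info, indent=2))
    elif "parameters" not in info:
        raise KeyError("no saved parameters; run once with enter_parameters set to true")
    return info["parameters"]
```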