
Analyzing_Allen_Visual_Coding_Neuropixels_Dataset

Resource

This repository contains code for analyzing the public Allen Visual Coding – Neuropixels dataset: https://allensdk.readthedocs.io/en/latest/visual_coding_neuropixels.html.

Requirements

Python >= 3.10 (legacy code: 3.8.16)

Requires Anaconda (conda). Create and use a dedicated environment:

# create env with Python >=3.10 and the Anaconda metapackage
conda create -n allen python=3.10 anaconda -y

# activate the environment
conda activate allen

# install project dependencies into the conda env
pip install -r requirements.txt

Modules

  • Notebooks: Jupyter notebooks for analysis and visualization.

  • Scripts: Scripts for batch processing with an automated pipeline.

  • Tools: Test notebooks for developing tool functions.

  • Toolkit: Function modules for this repo.

  • Docs: Documentation for this repo.

  • Legacy: Legacy code for this repo.

Configuration

  • path_config.json: Set the paths for the cache data and output data.

  • global_settings.json: Set the global settings and parameters for the analysis.

  • output_config.json: Set the format for the output data.

  • sessions.json: Session ID lists for test runs of the batch-processing scripts, and a blacklist of sessions to exclude from processing.
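A sketch of what sessions.json might look like, based on the 'test' and 'blacklist' keys mentioned below (the session IDs shown are placeholders, and any additional keys in the actual file are not reflected here):

```json
{
  "test": [111111111, 222222222],
  "blacklist": [333333333]
}
```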

Analysis Procedures

Setup on the server

  • See Requirements for how to create a conda environment with the necessary dependencies.

  • Set the paths for the cache data and output data in path_config.json. We suggest using the shared directory /home/shared/Allen_Visual_Coding_Data on the server as the root directory.

  • Make sure the conda environment is activated before running any script.

  • Edit the batch script batch_run_script.sh to run the desired Python script under the scripts/ folder. Set the --session_set argument to the desired set of sessions to process, and set other arguments as needed. Run python scripts/[script_name].py -h to see the available arguments for each script.

  • Make sure the directory ./stdout/ exists under the root directory of the repository; the batch-processing scripts write their output logs there.

  • Run the batch script with sbatch batch_run_script.sh to process the sessions.
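Before submitting jobs, it can help to sanity-check the configured paths. A minimal sketch (the keys batch_log_dir and output_base_dir appear elsewhere in this README; any other keys in your config are checked the same way, and the function name is illustrative):

```python
import json
import os

def check_paths(config_file="path_config.json"):
    """Load path_config.json and report configured directories that do
    not exist, plus ./stdout/ (required for batch logs)."""
    with open(config_file) as f:
        config = json.load(f)
    # every string-valued entry is assumed to be a directory path
    missing = [key for key, path in config.items()
               if isinstance(path, str) and not os.path.isdir(path)]
    # ./stdout/ must exist under the repository root for output logs
    if not os.path.isdir("stdout"):
        missing.append("./stdout/")
    return missing
```

Run it from the repository root; an empty return value means all configured directories exist.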

Parallel Processing with SLURM Array Jobs

For large workloads, use submit_array_job.sh to split sessions across parallel tasks:

  1. Edit the configuration variables at the top of the script (NUM_TASKS, SCRIPT_PATH, SCRIPT_ARGS)
  2. Run bash submit_array_job.sh

The script automatically splits sessions, runs tasks in parallel, and combines logs after completion.
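The splitting step can be sketched as follows (a simplified illustration of dividing sessions into near-equal contiguous chunks; the repo's actual splitting logic may differ, and the function name is an assumption):

```python
def split_sessions(session_ids, array_index, array_total):
    """Return the contiguous chunk of session IDs assigned to one
    array task, distributing any remainder across the first tasks."""
    n = len(session_ids)
    base, extra = divmod(n, array_total)
    # the first `extra` tasks each take one extra session
    start = array_index * base + min(array_index, extra)
    size = base + (1 if array_index < extra else 0)
    return session_ids[start:start + size]
```

Every session is assigned to exactly one task, so the per-task results can simply be concatenated afterwards.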

Scripts (for batch processing)

  • After a script finishes running, check the batch_logs folder (see batch_log_dir in path_config.json) for logs of printed messages and errors. The parameters used for each run are saved as .json files in the same folder.

  • Common arguments for all scripts:

    • --session_set: The set of sessions to process. Available sets: all, test, selected, unselected, optotag.
      • all: All sessions in Allen's database.
      • test: Test sessions listed in 'test' key of sessions.json.
      • selected: Selected sessions recorded in session_selection.csv file in the output folder (see output_base_dir in path_config.json).
      • unselected: Unselected sessions recorded in session_selection.csv file in the output folder (see output_base_dir in path_config.json), excluding sessions with missing LFP data.
      • optotag: Optotag sessions. (Not implemented yet)
    • --session_list: List of session IDs to process (space-separated). The --session_set argument is ignored if this is provided.
    • --use_blacklist: Exclude blacklisted sessions to avoid uncaught errors that may stall batch processing in some sessions. The blacklist is stored under the 'blacklist' key of sessions.json.
    • --disable_logging: Disable logging to the log file.
    • --array_index: SLURM array task index (0-based). Auto-detected from environment if not provided.
    • --array_total: Total number of array tasks. Auto-detected from environment if not provided.
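The common arguments above could be wired up roughly like this (a sketch: the flag names come from this README, while the defaults and choices are assumptions; SLURM_ARRAY_TASK_ID and SLURM_ARRAY_TASK_COUNT are the environment variables SLURM exports inside array jobs):

```python
import argparse
import os

def build_common_parser():
    """Build a parser for the arguments shared by the batch scripts."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--session_set", default="test",
                        choices=["all", "test", "selected", "unselected", "optotag"])
    parser.add_argument("--session_list", nargs="*", type=int, default=None,
                        help="Space-separated session IDs; overrides --session_set")
    parser.add_argument("--use_blacklist", action="store_true")
    parser.add_argument("--disable_logging", action="store_true")
    # auto-detect array task info from the SLURM environment if present
    parser.add_argument("--array_index", type=int,
                        default=int(os.environ.get("SLURM_ARRAY_TASK_ID", 0)))
    parser.add_argument("--array_total", type=int,
                        default=int(os.environ.get("SLURM_ARRAY_TASK_COUNT", 1)))
    return parser
```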

Utility Scripts

Execute the scripts in order; later scripts may depend on the results of earlier ones.

  1. find_probe_channels.py

    • Initial processing: download and cache data from the Allen database. If --cache_data_only is set, the script performs only this step and skips further processing.
    • Find probe channels for the target structure (e.g. VISp).
    • Compute the CSD for the channels in the structure. If --skip_compute_csd is set, this step is skipped.

    Additional processing: after the batch process finishes, run the notebook check_channel_layer_positions to fill in missing layer positions and overwrite the affected probe info and LFP channels files, avoiding errors in further processing.

  2. process_stimuli_psd.py

    • Calculate average PSD of stimuli.
    • Calculate PSD averaged across each condition of drifting gratings stimuli.
  3. analyze_psd_fooof.py

    • Fit FOOOF to the PSD of stimuli.
    • Get frequency bands of waves specified in global_settings.json.
    • Calculate band power in drifting grating conditions and PSD of stimuli with filtered conditions.
    • Save figures. (Optional: set save_figure to true in output_config.json)

    Additional processing: After finishing the batch process, run the notebook compile_psd.ipynb with session_set = 'all' to compile the PSD of stimuli and frequency bands of waves for all sessions.

  4. analyze_csd.py

    • Get trial averaged CSD and CSD power in wave bands from FOOOF results for stimuli flashes and drifting gratings.
    • Save figures. (Optional: set save_figure to true in output_config.json)
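The CSD computed in steps 1 and 4 is conventionally estimated as the negative second spatial derivative of the LFP along the probe. A simplified sketch of that estimator (not necessarily the exact method used by the repo's scripts; the channel spacing value is an assumption):

```python
import numpy as np

def compute_csd(lfp, channel_spacing_um=40.0):
    """Estimate current source density with the standard
    second-difference estimator along the channel (depth) axis.

    lfp: array of shape (n_channels, n_samples), channels ordered by depth.
    Returns an array of shape (n_channels - 2, n_samples).
    """
    h = channel_spacing_um * 1e-6  # channel spacing in meters
    # negative second difference across neighboring channels
    csd = -(lfp[2:] - 2 * lfp[1:-1] + lfp[:-2]) / h**2
    return csd
```

A voltage profile that is linear in depth yields zero CSD, as expected for a region with no net current sources or sinks.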

Notebooks

For interactive analysis and visualization. Some of them are part of the procedures in Scripts.

  • Find_Probe_Channels

    Initial processing before analyzing a session. Find the LFP channels in the selected cortical structure and get the central channels in each layer.

  • check_channel_layer_positions

    Check the layer positions of the LFP channels in the selected structure. Some sessions have LFP channels with missing CCF coordinates; for these, layer positions are inferred from each channel's vertical position using the average layer-boundary fractions estimated from sessions that do have CCF coordinates. The probe info file and LFP channels file are then overwritten with the inferred layers.

  • Spectral_analysis

    Calculate PSD of stimuli of a session and apply FOOOF to fit the PSD. Get frequency bands of waves specified in global_settings.json. Calculate band power in drifting grating conditions and PSD of stimuli with filtered conditions.

  • compile_psd

    Compile the PSD of stimuli and frequency bands of waves for all sessions.

    1. Compute the average PSD across sessions and get the frequency bands of waves.
    2. Compile frequency bands of waves for all sessions.
    3. Find sessions with good power in each wave band of interest.
    4. Find the frequency band of each wave in the layer of interest in specific stimulus.
    5. Save the results of the above steps to data files.
    6. Save figures of average PSD across sessions. (Optional: set save_figure to true in output_config.json)
  • CSD_during_stimuli

    Analyze the trial averaged CSD and CSD power in wave bands from FOOOF results for stimuli flashes and drifting gratings.
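The layer-guessing idea behind check_channel_layer_positions can be sketched as follows (an illustration only: the function, layer names, and boundary fractions are assumptions, not the notebook's actual code):

```python
import numpy as np

def guess_layers(channel_depths, boundary_fractions, layer_names):
    """Assign a cortical layer to each channel by its relative
    vertical position within the recorded span.

    channel_depths: depths in um, increasing from the surface.
    boundary_fractions: increasing fractions in (0, 1), the average
        fractional depths of layer boundaries estimated from sessions
        with CCF coordinates.
    layer_names: len(boundary_fractions) + 1 names, superficial to deep.
    """
    depths = np.asarray(channel_depths, dtype=float)
    # normalize depths to [0, 1] across the recorded span
    frac = (depths - depths.min()) / (depths.max() - depths.min())
    # each channel falls in the interval between two boundaries
    idx = np.searchsorted(boundary_fractions, frac, side="right")
    return [layer_names[i] for i in idx]
```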

Helper notebooks:

  • compress_data

    Compress data files from the cache or output directory into .zip files, filtered by file patterns.

  • review_figures

    Display figures from the output figure directory in the notebook, filtered by specific file patterns.
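The pattern-filtered compression done by compress_data might look like this minimal sketch (the function name and pattern default are illustrative, not the notebook's actual code):

```python
import fnmatch
import os
import zipfile

def compress_matching(src_dir, zip_path, pattern="*.csv"):
    """Zip all files under src_dir whose names match a glob pattern.
    Returns the number of files added to the archive."""
    count = 0
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for root, _dirs, files in os.walk(src_dir):
            for name in fnmatch.filter(files, pattern):
                full = os.path.join(root, name)
                # store paths relative to src_dir inside the archive
                zf.write(full, os.path.relpath(full, src_dir))
                count += 1
    return count
```

Write the archive outside src_dir so the zip file itself is not picked up by the directory walk.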

Analysis Procedures (Legacy)

  1. Edit the configuration file to set the directories: cache_dir for the allensdk data cache, output_dir for result data, and figure_dir for result figures. Specify properties in filter_dict for filtering sessions in the dataset.

  2. Run Choose_Visual_Data to retrieve data for a target region in a session. Local field potential (LFP) data averaged from groups of channels, along with an information .json file for the selected session and region, will be saved to the output folder.

  3. Edit analysis_object in the configuration file to refer to the selected session and region. Set enter_parameters to true for the first time analyzing the selected object.

  4. Run Analyze_Visual_LFP to analyze power spectrum of the LFP.

  5. Run Analyze_Visual_Spike to analyze population firing activities and save result data to the output folder.

  6. Run Analyze_Visual_Entrainment to analyze the entrainment of units to the oscillation and correlations between unit properties. Steps 4 and 5 must be completed before this step.

  7. The selected parameters for the analyses are saved in the information .json file. If enter_parameters is set to false and the analyses in steps 4 to 6 are rerun, the saved parameters are used. If save_figure is set to true, figures are saved to the figure folder.
