Event-related Analysis#

This example can be referenced by citing the package.

This example shows how to use NeuroKit to extract epochs from data based on events localisation and its corresponding physiological signals. That way, you can compare experimental conditions with one another.

# Load NeuroKit and other useful packages
import neurokit2 as nk
import pandas as pd
import numpy as np
import seaborn as sns

The Dataset#

Use the nk.data() function to load the dataset located on NeuroKit data folder.

It contains 2.5 minutes of biosignals recorded at a frequency of 100Hz (2.5 x 60 x 100 = 15000 data points).

Biosignals : ECG, RSP, EDA + Photosensor (event signal)

# Download data
data = nk.data("bio_eventrelated_100hz")

This is the data from one participant that saw 4 emotional and neutral images from the IAPS), which we will refer to as events.

Importantly, the images were marked in the recording system (the “triggers”) by a small black rectangle on the screen, which led to the photosensor signal to go down (and then up again after the image). This is what will allow us to retrieve the location of these events.

They were 2 types of images (the condition) that were shown to the participant: “Negative” vs. “Neutral” (in terms of emotion). Each picture was presented for 3 seconds. The following list is the condition order.

condition_list = ["Negative", "Neutral", "Neutral", "Negative"]

Find Events#

These events can be localized and extracted using events_find().

Note that you should also specify whether to select events that are higher or below the threshold using the threshold_keep argument.

# Find events
events = nk.events_find(data["Photosensor"], threshold_keep='below', event_conditions=condition_list)
events

{'onset': array([ 1024,  4957,  9224, 12984]),
 'duration': array([300, 300, 300, 300]),
 'label': array(['1', '2', '3', '4'], dtype='<U11'),
 'condition': ['Negative', 'Neutral', 'Neutral', 'Negative']}

As we can see, events_find() returns a dict containing onsets and durations for each corresponding event, based on the label for event identifiers and each event condition. Each event here lasts for 300 data points (equivalent to 3 seconds sampled at 100Hz).

# Plot the location of event with the signals
plot = nk.events_plot(events, data)

../../_images/f1ed439591daae9181271f07e5855aa12500651bd6dd567ad5b594d6a8b67423.png

The output of events_plot() shows the corresponding events in the signal, with the blue dashed line representing a Negative event and red dashed line representing a Neutral event.

Process the Signals#

Now that we have the events location, we can go ahead and process the data.

Biosignals processing can be done quite easily using NeuroKit with the bio_process() function. Simply provide the appropriate biosignal channels and additional channels that you want to keep (for example, the photosensor), and bio_process() will take care of the rest. It will return a dataframe containing processed signals and a dictionary containing useful information.

# Process the signal
data_clean, info = nk.bio_process(ecg=data["ECG"], 
                                  rsp=data["RSP"], 
                                  eda=data["EDA"], 
                                  keep=data["Photosensor"], 
                                  sampling_rate=100)

# Visualize some of the channels
data_clean[["ECG_Rate", "RSP_Rate", "EDA_Phasic", "Photosensor"]].plot(subplots=True)

array([<Axes: >, <Axes: >, <Axes: >, <Axes: >], dtype=object)

../../_images/ed155a5854ccfaf5e9f46b97841a7689e83f8e7c041780f9bbb487dbe0345f1c.png

Create Epochs#

We now have to transform this dataframe into epochs, i.e. segments (chunks) of data around the events using epochs_create(). We want it to start 1 second before the event onset and end 6 seconds after. These values are passed into the epochs_start and epochs_end arguments, respectively.

Our epochs will then cover the region from -1 s to +6 s (i.e., 700 data points since the signal is sampled at 100Hz).

# Build and plot epochs
epochs = nk.epochs_create(data_clean, events, sampling_rate=100, epochs_start=-1, epochs_end=6)

Let’s plot some of the signals of the first epoch (and transform them to the same scale for visualization purposes).

# Iterate through epoch data
for epoch in epochs.values():
    # Plot scaled signals
    nk.signal_plot(epoch[['ECG_Rate', 'RSP_Rate','EDA_Phasic', "Photosensor"]], 
                   title=epoch['Condition'].values[0],  # Extract condition name
                   standardize=True)  

../../_images/0335e07638d2517dda5f7ebd176624a87ea01f43b80a54540a2906db750ce9a2.png

../../_images/924cf154c84145fa21fc9e7cfffaa8423c9a15d1d9ee9eefce0f2080daac4c18.png

../../_images/78ce842a0cc2e46f6040e768bdea5fe360dfd837bfacafc22624b1ba00bb2c86.png

../../_images/75eed28facd9142136e25887f3c953faacdd4f3c5b150a4687dc92244566dac1.png

Manually Extract Event Related Features#

With these segments, we are able to compare how the physiological signals vary across the different events. We do this by:

Iterating through our object epochs
Storing the mean value of $X$ feature of each condition in a new dictionary
Saving the results in a readable format

We will call these 3 objects epochs-dictionary, the mean-dictionary and our results-dataframe.

df = {}  # Initialize an empty dict to store the results
         
# Iterate through epochs index and data
for epoch_index, epoch in epochs.items():
    df[epoch_index] = {}  # Initialize an empty dict inside of it
                            

    # Note: We will use the 100th value (corresponding to the event onset, 0s) as the baseline

    # ECG ====
    ecg_baseline = epoch["ECG_Rate"].values[100]  # Baseline
    ecg_mean = epoch["ECG_Rate"][0:4].mean()  # Mean heart rate in the 0-4 seconds
    # Store ECG in df
    df[epoch_index]["ECG_Rate_Mean"] = ecg_mean - ecg_baseline  # Correct for baseline

    # RSP ====
    rsp_baseline = epoch["RSP_Rate"].values[100]  # Baseline
    rsp_rate = epoch["RSP_Rate"][0:6].mean()  # Longer window for RSP that has a slower dynamic
    # Store RSP in df
    df[epoch_index]["RSP_Rate_Mean"] = rsp_rate - rsp_baseline  # Correct for baseline

    
    # EDA/SCR ====
    scr_max = epoch["SCR_Amplitude"][0:6].max()  # Maximum SCR peak
    # If no SCR, consider the magnitude, i.e. that the value is 0
    if np.isnan(scr_max):
        scr_max = 0  
    # Store SCR in df
    df[epoch_index]["SCR_Magnitude"] = scr_max

df = pd.DataFrame.from_dict(df, orient="index")  # Convert to a dataframe
df["Condition"] = condition_list  # Add the conditions
df  # Print DataFrame

C:\Users\runneradmin\AppData\Local\Temp\ipykernel_2520\4246861920.py:12: FutureWarning: The behavior of obj[i:j] with a float-dtype index is deprecated. In a future version, this will be treated as positional instead of label-based. For label-based slicing, use obj.loc[i:j] instead
  ecg_mean = epoch["ECG_Rate"][0:4].mean()  # Mean heart rate in the 0-4 seconds
C:\Users\runneradmin\AppData\Local\Temp\ipykernel_2520\4246861920.py:18: FutureWarning: The behavior of obj[i:j] with a float-dtype index is deprecated. In a future version, this will be treated as positional instead of label-based. For label-based slicing, use obj.loc[i:j] instead
  rsp_rate = epoch["RSP_Rate"][0:6].mean()  # Longer window for RSP that has a slower dynamic
C:\Users\runneradmin\AppData\Local\Temp\ipykernel_2520\4246861920.py:24: FutureWarning: The behavior of obj[i:j] with a float-dtype index is deprecated. In a future version, this will be treated as positional instead of label-based. For label-based slicing, use obj.loc[i:j] instead
  scr_max = epoch["SCR_Amplitude"][0:6].max()  # Maximum SCR peak

	ECG_Rate_Mean	RSP_Rate_Mean	SCR_Magnitude	Condition
1	-2.005527	-0.591449	3.114808	Negative
2	-3.119900	-0.009901	0.000000	Neutral
3	1.336250	-0.899208	0.000000	Neutral
4	-3.543494	0.646818	1.675922	Negative

You can save this dataframe (df.to_csv("results.csv")) and proceed to do some actual statistics on it.

Automatic Feature Extraction#

While manual feature creation allows you to compute and extract exactly what you need, using our automated pipeline is a lot easier.

df = nk.bio_analyze(epochs, sampling_rate=100)
df

	Label	Condition	Event_Onset	ECG_Rate_Baseline	ECG_Rate_Max	ECG_Rate_Min	ECG_Rate_Mean	ECG_Rate_SD	ECG_Rate_Max_Time	ECG_Rate_Min_Time	...	RSP_RVT_Baseline	RSP_RVT_Mean	EDA_Peak_Amplitude	EDA_SCR	SCR_Peak_Amplitude	SCR_Peak_Amplitude_Time	SCR_RiseTime	SCR_RecoveryTime	RSA_P2T	RSA_Gates
1	1	Negative	1024	58.962843	1.037157	-7.238706	-3.416404	2.462046	3.035765	5.329041	...	0.225643	-0.037677	1.995617	1	3.114808	4.718169	1.74	NaN	-56.170030	1.776357e-15
2	2	Neutral	4957	64.000846	-0.056683	-5.177317	-3.209327	1.661333	0.011445	1.323319	...	0.169809	0.019351	0.868942	0	NaN	NaN	NaN	NaN	-43.273326	-7.601609e-02
3	3	Neutral	9224	55.976284	5.248206	-1.922230	1.891089	2.279224	2.054363	1.072961	...	0.127095	0.017846	0.026651	0	NaN	NaN	NaN	NaN	-13.138628	1.358487e-01
4	4	Negative	12984	57.505912	0.186396	-5.781774	-2.941543	2.142268	4.768240	2.565093	...	0.080601	0.012434	1.056855	1	1.675922	2.845494	1.73	1.93	-24.326239	1.843122e-01

4 rows × 46 columns

Event-related Analysis

Contents