Unlocking Neuroinformatics: EEG & Multi Omics Synergy with Genestack ODM

11.09.24

Introduction

The integration of diverse data types in biomedical research is becoming increasingly important as scientists strive to develop a more holistic understanding of complex biological systems. One area where this integration is particularly valuable is in the study of brain activity through Electroencephalography (EEG). Traditionally used in clinical and research settings to monitor electrical activity in the brain, EEG data offers insights into neurological conditions, cognitive processes, and brain function.

Genestack's Open Data Manager (ODM) has been a cornerstone for managing Life Science research data, but its capabilities extend beyond traditional bioinformatics applications. This case study explores how EEG data, specifically in the European Data Format (EDF), can be integrated into ODM and combined with omics data, showcasing the platform's versatility and the possibilities it offers for neuroinformatics research.

Understanding EEG and EDF

EEG is a technique that measures electrical activity in the brain using electrodes placed on the scalp. It is widely used to diagnose and monitor neurological disorders such as epilepsy, sleep disorders, and brain injuries. The data recorded during EEG sessions are typically stored in EDF, a standardized format designed for storing multichannel bioelectrical signals. EDF files not only contain the raw signal data but also include essential metadata such as sampling rates, electrode locations, and patient demographics.

Example of EEG data visualized [1]
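
To make the "metadata inside the file" point concrete, here is a minimal sketch that reads the fixed 256-byte header at the start of an EDF file using only the Python standard library. The field widths follow the published EDF layout; the field names, the helper name read_fixed_header, and the demo header values are assumptions made for illustration and are not taken from the study files.

```python
import io
from typing import BinaryIO

# (field name, width in bytes) of the fixed 256-byte EDF header.
# The names are illustrative; the widths follow the EDF layout.
EDF_FIXED_HEADER = [
    ("version", 8),
    ("patient_id", 80),
    ("recording_id", 80),
    ("start_date", 8),         # dd.mm.yy
    ("start_time", 8),         # hh.mm.ss
    ("header_bytes", 8),       # total header size, including per-signal headers
    ("reserved", 44),
    ("n_data_records", 8),
    ("record_duration_s", 8),
    ("n_signals", 4),
]


def read_fixed_header(stream: BinaryIO) -> dict:
    """Return the fixed EDF header fields as stripped ASCII strings."""
    raw = stream.read(256)
    if len(raw) != 256:
        raise ValueError("stream is too short to contain an EDF header")
    fields = {}
    offset = 0
    for name, width in EDF_FIXED_HEADER:
        fields[name] = raw[offset:offset + width].decode("ascii").strip()
        offset += width
    return fields


# Synthetic header with made-up values, so the sketch runs without a real file.
demo_values = ["0", "X X X X", "Startdate X X X X", "01.01.17", "10.00.00",
               "2304", "", "60", "1", "8"]
demo_header = b"".join(
    value.ljust(width).encode("ascii")
    for value, (_, width) in zip(demo_values, EDF_FIXED_HEADER)
)
header = read_fixed_header(io.BytesIO(demo_header))
print(header["n_signals"], header["header_bytes"])
```

With a real recording, the same function could be called on a file opened in binary mode. Channel labels, units, and samples per record live in the per-signal headers that follow these 256 bytes, which this sketch does not parse.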

The Challenge

Managing EEG data presents several unique challenges:

1. Data Volume and Complexity: EEG studies can produce vast amounts of data, especially when recorded over extended periods or across multiple channels. Handling this volume of data requires efficient storage solutions and powerful computational tools.

2. Metadata Management: Accurate and comprehensive metadata is essential for contextualizing EEG data, facilitating data sharing, and ensuring the reproducibility of research findings.

3. Integration with Other Data Types: For a comprehensive understanding of brain function, EEG data often needs to be integrated with other data types, such as genetic, imaging, and clinical data. This integration requires a flexible and interoperable data management platform.

4. Security and confidentiality: Security and confidentiality are crucial for EEG medical data research because this data often contains sensitive personal health information, which must be protected to comply with privacy laws and regulations. Additionally, maintaining confidentiality is essential for the ethical conduct of research and ensuring that participants' identities and medical histories are safeguarded.

5. Easy flow and design: If the platform used to manage this data is not user-friendly, it can lead to inefficiencies, errors, and frustration, especially for researchers who may not be highly technical. A steep learning curve or cumbersome interface can slow down the research process, making it difficult to perform tasks like data cleaning, annotation, or visualization effectively.

Why ODM?

Genestack's ODM is ideally suited to address these challenges, providing a robust infrastructure for managing and analyzing complex datasets. Here are some key features of ODM that make it an excellent choice for EEG data management:

1. Scalable Storage and Organization: ODM offers scalable solutions that can accommodate the large volumes of data generated by EEG studies. Its flexible data organization capabilities allow researchers to categorize and manage data effectively, ensuring that it remains accessible and useful for future analyses.

2. Comprehensive Metadata Support: ODM's metadata management capabilities are particularly valuable for EEG data. The platform allows for the detailed annotation of datasets, capturing information about the experimental setup, data collection parameters, and patient characteristics. This metadata is critical for data interpretation, quality control, and facilitating secondary data use. ODM also enforces agreed vocabularies and ontologies so that data remains consistent and findable.

3. Data Integration and Interoperability: ODM supports the integration of EEG data with other types of biological and clinical data, enabling researchers to conduct multidisciplinary studies. The platform's data model can accommodate diverse data formats and standards, facilitating seamless data integration and analysis.

4. Security and Confidentiality: Genestack’s ODM product is developed against requirements that have been cross-referenced and aligned with 21 CFR Part 11, the ICH E6 (R2) guideline for good clinical practice, and ISO 27002 information security controls. These requirements are a primary input to the design and development of ODM and enable our clients to comply with the applicable regulatory requirements when using the product. [2]

5. User friendly: ODM is designed with the user in mind, letting you organize, curate, and retrieve data through the GUI, Python scripts, or APIs, depending on your technical expertise.

Case Example: Uploading and Utilizing EEG Data in ODM

EEG data in EDF format is inherently complex, often comprising raw signals, annotations, and header information within a single file. This complexity presents significant challenges for researchers in terms of storage, retrieval, and organization of their studies. In this case study, we explore how EEG data in EDF format can be effectively stored in ODM and discuss the necessary preprocessing steps to optimize the user experience.

For this example, we use two EDF files from the study “EEG Signals from an RSVP Task” by Matran-Fernandez et al. (2017) [3], available on the PhysioNet open database.

The selected EDF files were generated using the Rapid Serial Visual Presentation (RSVP) protocol, in which aerial images of London were presented at a rate of 5 Hz. The EEG signals were captured from eight channels according to the 10-20 system (PO8, PO7, PO3, PO4, P7, P8, O1, and O2), and the files are named using the format rsvp_5Hz_{participant_id}{a,b}.edf. The suffix (either "a" or "b") indicates the sequence in which the files were recorded. Each file contains continuous EEG recordings sampled at 2048 Hz, with signals filtered between 0.15 and 28 Hz. The data is measured in microvolts.
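
The recording parameters above already say a lot about data volume. The short calculation below is a back-of-the-envelope sketch: the constants come from the description above, the 2 bytes per sample reflects EDF's 16-bit sample storage, and the 60-second duration is an arbitrary example rather than the length of the actual recordings.

```python
SAMPLING_RATE_HZ = 2048   # samples per second per channel (from the study description)
STIMULUS_RATE_HZ = 5      # images presented per second
N_CHANNELS = 8            # PO8, PO7, PO3, PO4, P7, P8, O1, O2
BYTES_PER_SAMPLE = 2      # EDF stores each sample as a 16-bit integer

# How many samples per channel are recorded while one image is on screen.
samples_per_image = SAMPLING_RATE_HZ / STIMULUS_RATE_HZ
image_duration_ms = 1000 / STIMULUS_RATE_HZ


def raw_signal_bytes(duration_s: int) -> int:
    """Size of the raw signal payload (headers excluded) for a recording."""
    return N_CHANNELS * SAMPLING_RATE_HZ * BYTES_PER_SAMPLE * duration_s


def table_rows(duration_s: int) -> int:
    """Rows in a one-row-per-time-point table exported from the recording."""
    return SAMPLING_RATE_HZ * duration_s


print(samples_per_image, image_duration_ms)
print(raw_signal_bytes(60), table_rows(60))
```

Each image therefore spans roughly 410 samples per channel, and even one minute of recording turns into more than a hundred thousand table rows once exported as text.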

Data Preparation and Organisation

Deconstruction of EDF file

The first step involves deconstructing the EDF file to gain a deeper understanding of the contained data. We utilize Python 3 and the Jupyter Notebook framework for this task, leveraging the pyedflib library, which provides an efficient and straightforward way to handle EDF files. By using pyedflib, we can easily extract three DataFrames from an EDF file: raw_signal, header, and annotations.
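
The deconstruction step could look roughly like the sketch below. It is not the notebook used for this case study: the function names read_edf, signal_rows, and annotation_rows and the column names are hypothetical, and it assumes all channels share one sampling rate. It returns plain lists of dictionaries, which pandas.DataFrame accepts directly if DataFrames are wanted.

```python
def signal_rows(labels, signals, sample_rate_hz):
    """One row per time point: a time column plus one column per channel."""
    n = min((len(s) for s in signals), default=0)
    return [
        {"time_s": i / sample_rate_hz,
         **{label: float(sig[i]) for label, sig in zip(labels, signals)}}
        for i in range(n)
    ]


def annotation_rows(onsets, durations, texts):
    """One row per annotation stored in the EDF file."""
    return [
        {"onset_s": float(o), "duration_s": float(d), "annotation": str(t)}
        for o, d, t in zip(onsets, durations, texts)
    ]


def read_edf(path):
    """Split one EDF file into raw_signal, header, and annotations tables."""
    # Third-party dependency, imported here so the helpers above
    # can be used and tested without pyedflib installed.
    import pyedflib

    reader = pyedflib.EdfReader(path)
    try:
        n_signals = reader.signals_in_file
        labels = reader.getSignalLabels()
        signals = [reader.readSignal(i) for i in range(n_signals)]
        # Assumption: every channel uses the rate of the first channel.
        sample_rate_hz = reader.getSampleFrequency(0)
        header = reader.getHeader()
        onsets, durations, texts = reader.readAnnotations()
    finally:
        reader.close()

    return {
        "raw_signal": signal_rows(labels, signals, sample_rate_hz),
        "header": header,
        "annotations": annotation_rows(onsets, durations, texts),
    }


# Usage, assuming the file has been downloaded locally:
# tables = read_edf("rsvp_5Hz_02a.edf")
# raw_signal = pandas.DataFrame(tables["raw_signal"])
```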

Alignment with the ODM data model

With the raw data extracted, the next step is to align it with the ODM data model. The following actions are taken to prepare the data:

Raw signal file (first 10 rows; rsvp_5Hz_02a) - saved as rsvp_5Hz_02a.tsv

Annotations (rsvp_5Hz_02a) - saved as rsvp_5Hz_02a_annotations.tsv

Header (samples data for rsvp_5Hz_02a and rsvp_5Hz_02b) - saved as rsvp_5Hz_02.tsv
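
The file naming used in this section can be sketched with the standard library alone. The helper names output_names and write_tsv are hypothetical, and the regular expression assumes participant IDs are purely numeric, which matches the example files but is not confirmed for the whole dataset.

```python
import csv
import io
import re


def output_names(recording: str) -> dict:
    """Map a recording name such as 'rsvp_5Hz_02a' to its three TSV file names."""
    match = re.fullmatch(r"(rsvp_5Hz_\d+)([ab])", recording)
    if match is None:
        raise ValueError(f"unexpected recording name: {recording!r}")
    participant = match.group(1)
    return {
        "raw_signal": f"{recording}.tsv",
        "annotations": f"{recording}_annotations.tsv",
        # One header (samples) file per participant, covering both recordings.
        "header": f"{participant}.tsv",
    }


def write_tsv(rows, stream) -> None:
    """Write a list of dictionaries as tab-separated text with a header row."""
    if not rows:
        return
    writer = csv.DictWriter(stream, fieldnames=list(rows[0]),
                            delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)


names = output_names("rsvp_5Hz_02a")

# Placeholder values, written to memory instead of disk for the demo.
buffer = io.StringIO()
write_tsv([{"time_s": 0.0, "O1": 1.5}], buffer)
print(names["annotations"])
print(buffer.getvalue())
```

To write to disk, pass a file opened with open(names["raw_signal"], "w", newline="") instead of the in-memory buffer.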

Data upload

With the files prepared, uploading them to ODM through the graphical user interface is straightforward. Create a new study, choose a template, and upload the samples’ metadata (the file saved as rsvp_5Hz_02.tsv).

Next, in the "Data" section, upload the remaining four data files (two raw signal files and two annotation files: rsvp_5Hz_02a.tsv, rsvp_5Hz_02b.tsv, rsvp_5Hz_02a_annotations.tsv, and rsvp_5Hz_02b_annotations.tsv), specifying the delimiter as ".". Your data is now ready for further analysis, curation, or visualization.

Integrative approach of omics and imaging data

In Genestack's Open Data Manager (ODM), researchers can combine omics data (such as genomics, transcriptomics, and proteomics) with brain imaging data such as EEG, MRI, and PET. This integration is crucial for advancing our understanding of brain diseases because it enables a comprehensive analysis of both molecular and imaging data within a single platform. By unifying these diverse data types in ODM, researchers can uncover connections between molecular pathways and brain structure or function that might remain hidden when the data types are analyzed separately. This holistic approach accelerates the discovery of novel therapeutic targets and biomarkers, facilitating the development of more precise and effective treatments for brain diseases. Additionally, ODM’s user-friendly interface helps researchers manage and analyze even complex datasets efficiently. [4]

Integrative approach of omics and imaging data to discover new insights for understanding brain diseases [4]

In ODM, integrating and managing different types of omics data is straightforward. Once the data is obtained, you can upload the datasets through ODM’s interface by following a simple step-by-step process. The platform supports direct data import, which brings different data types into a unified repository. Once uploaded, ODM automatically organizes and indexes the data, making it readily accessible. This process reduces the need for complex data handling or manual formatting and keeps all relevant information in one centralized location. Having all this data in ODM not only simplifies data management but also makes it quicker to retrieve and analyze the information, supporting a more efficient and comprehensive research workflow.

Example of variants, biomarker and epigenomics data, combined with EEG signal and annotations data stored in ODM.
Disclaimer: The epigenomics, biomarker, and gene variant data are example data and are not connected to the study “EEG Signals from an RSVP Task” by Matran-Fernandez et al. (2017).
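
Conceptually, combining data types comes down to linking records that describe the same sample. The sketch below illustrates that idea in plain Python. It is not ODM's API or data model: the function link_by_sample, the sample_id key, and every value in the demo are placeholders invented for illustration.

```python
from collections import defaultdict


def link_by_sample(*tables):
    """Group rows from several (data_type, rows) tables by their sample_id."""
    linked = defaultdict(dict)
    for data_type, rows in tables:
        for row in rows:
            linked[row["sample_id"]].setdefault(data_type, []).append(row)
    return dict(linked)


# Placeholder records only; none of these values come from a real dataset.
eeg_annotations = [
    {"sample_id": "rsvp_5Hz_02a", "onset_s": 1.0, "annotation": "example-event"},
    {"sample_id": "rsvp_5Hz_02b", "onset_s": 2.0, "annotation": "example-event"},
]
variants = [
    {"sample_id": "rsvp_5Hz_02a", "variant": "example-variant-1"},
]

linked = link_by_sample(("eeg_annotations", eeg_annotations), ("variants", variants))
print(sorted(linked))
print(sorted(linked["rsvp_5Hz_02a"]))
```

A sample with only one data type, such as rsvp_5Hz_02b here, still appears in the result, which makes gaps in coverage easy to spot.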

Data Analysis and Visualization

Once the data is housed in ODM, researchers can proceed with analysis using ODM's curation capabilities. For more advanced analyses, ODM supports integration with external tools and software (the EEGLAB output below is one example), enabling the application of specialized signal processing techniques or machine learning algorithms. This flexibility is essential for addressing the diverse analytical requirements of neuroinformatics research. Furthermore, integrating the output data back into ODM keeps all research data centralized, organized, and accessible, streamlining the research process.

Example of Power Spectrum and Scalp Maps generated by EEGLAB toolbox [5]
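
For readers who want to see what a power spectrum computes, the sketch below uses a naive discrete Fourier transform written with the standard library only. It is not how EEGLAB works internally and it is too slow for real recordings; the synthetic 5 Hz sine wave and the 64 Hz sampling rate are assumptions chosen to keep the demo small.

```python
import cmath
import math


def power_spectrum(samples, sample_rate_hz):
    """Return (frequency_hz, power) pairs from a naive O(n^2) DFT."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):  # non-negative frequencies only
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        spectrum.append((k * sample_rate_hz / n, abs(coeff) ** 2 / n))
    return spectrum


# Synthetic one-second signal: a 5 Hz sine wave sampled at 64 Hz.
SAMPLE_RATE_HZ = 64
signal = [math.sin(2 * math.pi * 5 * t / SAMPLE_RATE_HZ) for t in range(SAMPLE_RATE_HZ)]

spectrum = power_spectrum(signal, SAMPLE_RATE_HZ)
peak_hz = max(spectrum, key=lambda pair: pair[1])[0]
print(peak_hz)
```

For recordings sampled at 2048 Hz, an FFT-based routine such as numpy.fft.rfft or scipy.signal.welch would be the practical choice.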

Collaboration and Data Sharing

One of the strengths of ODM is its support for collaborative research. The platform allows researchers to share datasets, analysis workflows, and results with collaborators. This feature is particularly valuable for large-scale studies that involve multiple research groups or institutions, as it facilitates data sharing, coordination, and reproducibility.

Conclusion

In the rapidly evolving field of neuroinformatics, integrating diverse datasets is key to unlocking new insights into brain function and disorders. This article has demonstrated how Genestack's Open Data Manager (ODM) can effectively handle EEG data in EDF format, combining it seamlessly with other omics data to create a holistic view of brain activity. By leveraging ODM's robust data management capabilities, researchers can overcome common challenges associated with large-scale EEG data, including data volume, metadata management, and integration with other biological data types.

The ability to integrate EEG data with genomics, transcriptomics, proteomics, and other data types within ODM not only enhances data accessibility and organization but also facilitates comprehensive analyses. This synergy accelerates the discovery of novel biomarkers and therapeutic targets, ultimately advancing our understanding of brain diseases.

Moreover, ODM's compatibility with external tools extends its analytical power, offering researchers the flexibility to apply specialized techniques and algorithms. This integration ensures that complex data analyses can be conducted efficiently, while maintaining a centralized and organized data repository.

The platform’s collaborative features further enhance its value, allowing researchers to share and coordinate efforts across multiple institutions, thus promoting reproducibility and accelerating scientific progress. Genestack's ODM stands out as a versatile and user-friendly solution for managing and analyzing neuroinformatics data, paving the way for groundbreaking research and innovations in understanding brain function and disease.

References

  1. Colorado State University, EEG data tutorial: Getting started. Retrieved September 4, 2024, from www.cs.colostate.edu
  2. Genestack, Security and compliance. Genestack. Retrieved September 4, 2024, from genestack.com
  3. Matran-Fernandez, A., Poli, R., Cinel, C., & Sepulveda, F. (2017). Towards the automated localization of targets in rapid image-sifting by collaborative brain-computer interfaces. PLoS ONE, 12(5), e0178498. doi.org
  4. Jong Hyuk Yoon, Hagyeong Lee, Dayoung Kwon, Dongha Lee, Seulah Lee, Eunji Cho, Jaehoon Kim, Dayea Kim, Integrative approach of omics and imaging data to discover new insights for understanding brain diseases, Brain Communications, Volume 6, Issue 4, 2024, fcae265, doi.org
  5. Neuroelectrics, EEGLAB. Retrieved September 4, 2024, from www.neuroelectrics.com