Genestack Platform Advantages

Smart, format-free files with rich metadata and a detailed provenance audit

Import data into Genestack and export it in any format. Our system converts and transforms files automatically for the tools that need them. You can forget about data grooming, cleansing or munging and let Genestack OS deal with that.

Files in Genestack are not the same as the files you will find on your machine, even though we call them by the same name. For example, a .fastq file contains short reads from a sequencer together with their quality scores. So does a .sra file from the Sequence Read Archive (formerly the Short Read Archive), only in a different format. Our files are format-free, biologically meaningful objects: a set of sequences, or a reference genome.

When working with these files you are guaranteed that they contain the right information, and you do not have to worry about what format that information is stored in. The platform automatically takes care of any conversions or adaptations as necessary.

Moreover, metadata attributes record where each file came from, how it was generated, and so forth. Metadata fields are typed, meaning that if a field is supposed to be numeric, it will necessarily contain a number, and if a field is supposed to be a publication with a PubMed ID, it will definitely have such an ID. This means that both users and the system can impose consistency constraints on the data. Smart files with rich provenance and metadata are a superb platform for data integration.
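
To make the idea of typed metadata fields concrete, here is a minimal sketch in Python. It is not the Genestack API: the class names `NumericField` and `PubMedField`, the function `validate_metadata` and the rule that a PubMed ID is a positive integer written in digits are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class NumericField:
    """Hypothetical field type that only accepts numbers."""
    name: str

    def validate(self, value):
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"{self.name}: expected a number, got {value!r}")
        return value


@dataclass(frozen=True)
class PubMedField:
    """Hypothetical publication field that must carry a PubMed ID."""
    name: str

    def validate(self, value):
        # Assumption: a PubMed ID is a positive integer written in digits.
        if not re.fullmatch(r"[1-9][0-9]*", str(value)):
            raise ValueError(f"{self.name}: {value!r} is not a PubMed ID")
        return int(value)


def validate_metadata(schema, metadata):
    """Return a validated copy of `metadata`; raise on any ill-typed value."""
    return {
        field.name: field.validate(metadata[field.name])
        for field in schema
        if field.name in metadata
    }


schema = [NumericField("read_length"), PubMedField("publication")]
clean = validate_metadata(schema, {"read_length": 100, "publication": "12345678"})
```

Because every value passes through its field's `validate` method, a consumer of `clean` can rely on `read_length` being a number and `publication` being an integer ID, which is the kind of consistency constraint described above.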

Data security, protection and reproducibility with our data flow mechanism

Our system allows you to conduct an analysis on a small number of samples, try different combinations of tools and parameters, and generate some visualisations; in other words, to refine your analysis methods. You can then take the result, view its provenance data flow, and reuse that data flow with selected input files replaced to create new output files. Like any other file in the system, data flows can be shared.


When working on a collaborative project it is useful to create a user group. Any user can do this and then invite others to join. After that, files can be shared with the group and linked to group folders. There is no limit on how many groups you create or on how you share and organise files with your collaborators.

Genestack files are protected from accidental modifications. For any file, the tool versions, parameters and data dependencies (that is, its provenance) can be viewed and replayed to repeat, modify and share analyses.

When files are created in Genestack, they contain all the parameters needed for computation, but the computation is not started. Parameters can still be modified, dependencies changed, and so forth. Once computation starts, the file is frozen and protected from any changes.
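
The lifecycle described above (editable until computation starts, frozen afterwards) can be sketched as a small Python class. This is an illustrative model only, not Genestack's implementation; `AnalysisFile`, `FrozenFileError` and all method, tool and file names are hypothetical.

```python
class FrozenFileError(Exception):
    """Raised when a frozen file is modified."""


class AnalysisFile:
    """Hypothetical file that records its provenance and freezes on start."""

    def __init__(self, tool, tool_version, parameters, inputs):
        self.tool = tool
        self.tool_version = tool_version
        self._parameters = dict(parameters)
        self._inputs = list(inputs)
        self._frozen = False

    def _ensure_editable(self):
        if self._frozen:
            raise FrozenFileError("computation has started; file is frozen")

    def set_parameter(self, key, value):
        self._ensure_editable()
        self._parameters[key] = value

    def replace_input(self, old, new):
        self._ensure_editable()
        self._inputs[self._inputs.index(old)] = new

    def start_computation(self):
        # From this point on, parameters and dependencies cannot change.
        self._frozen = True

    def provenance(self):
        """Return copies, so callers cannot alter the recorded provenance."""
        return {
            "tool": self.tool,
            "version": self.tool_version,
            "parameters": dict(self._parameters),
            "inputs": tuple(self._inputs),
        }


f = AnalysisFile("aligner", "1.0", {"threads": 2}, ["reads_A"])
f.set_parameter("threads", 4)        # allowed: computation not started
f.replace_input("reads_A", "reads_B")
f.start_computation()                # the file is now frozen
```

After `start_computation()`, any call to `set_parameter` or `replace_input` raises `FrozenFileError`, while `provenance()` still returns everything needed to repeat the analysis.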

Extensive interactive analysis and data visualisation genomics toolbox

We want you to be able to conduct common genomics analyses easily on Genestack. So we have integrated best-practice tools and workflows for processing DNA-Seq, RNA-Seq, ChIP-Seq and BS-Seq data, covering everything from QC to differential expression, variant calling and annotation, visualisation and much more.

In Genestack you will find a growing toolbox of genomics applications. A large number of these are wrappers around open source tools that have become the industry standard over the years. Our wrappers make the underlying tools independent of file format: what used to work only with BAM files can now work with other formats.

For very complex tools, we have worked hard to present a simple interface that uses biological metadata from the input files to suggest options. Advanced users can specify detailed command line parameters. Multiple versions of the original tools can live concurrently on the system, and a record is kept of which tool version is used in every analysis.

Some of the tools in Genestack are more than wrappers around open source tools. We have our own Genome Browser with collaborative features such as shared, editable views and genome visualisations built into reproducible analytic workflows. Our quality control, differential expression, variant filtering and functional annotation applications combine existing open source tools with our own technologies to be fast, scalable and format-free.

Free access to public data from major archives and specialised databases

Wherever possible, we try to reduce the complexity of public data: if the same data item is present in multiple repositories, we create only one Genestack file and attach the metadata from all sources to it.

A common use case for public data is automated mining. To make it easier and faster to write data mining applications, we add value to public datasets via curated fields. For a small number of selected attributes (organism, technology, and some others), we check the correctness of the values, harmonize them, and so forth. You can see which fields are curated in the metainfo viewer, or consult the developer documentation for API-level details.
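
As a rough illustration of the two ideas above (one file per data item with metadata from every source, plus a curated, harmonized attribute), consider the following Python sketch. It rests on stated assumptions: records from different repositories share a common `item_id`, and the synonym table `ORGANISM_SYNONYMS` is a made-up example rather than Genestack's curation rules.

```python
# Hypothetical synonym table for one curated attribute.
ORGANISM_SYNONYMS = {
    "human": "Homo sapiens",
    "h. sapiens": "Homo sapiens",
    "homo sapiens": "Homo sapiens",
    "mouse": "Mus musculus",
    "mus musculus": "Mus musculus",
}


def harmonize_organism(raw):
    """Map a free-text organism name to one canonical spelling."""
    cleaned = raw.strip()
    # Unknown names are kept as-is rather than guessed at.
    return ORGANISM_SYNONYMS.get(cleaned.lower(), cleaned)


def merge_records(records):
    """Merge (item_id, source, metadata) records into one entry per item.

    Each entry keeps the raw metadata of every source and a separate
    `curated` dict holding the harmonized organism, when one is given.
    """
    merged = {}
    for item_id, source, metadata in records:
        entry = merged.setdefault(item_id, {"sources": {}, "curated": {}})
        entry["sources"][source] = dict(metadata)
        if "organism" in metadata:
            entry["curated"]["organism"] = harmonize_organism(metadata["organism"])
    return merged


records = [
    ("item-1", "repository_a", {"organism": "human"}),
    ("item-1", "repository_b", {"organism": "Homo sapiens", "technology": "RNA-Seq"}),
    ("item-2", "repository_a", {"organism": " Mouse "}),
]
merged = merge_records(records)
```

A data mining script can then filter on `merged[...]["curated"]["organism"]` without handling every spelling that appears in the source repositories, while the untouched source metadata remains available under `sources`.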

Search and use data and metadata from public data archives, including ArrayExpress and GEO, ENA and SRA, as well as reference genomes and tracks from Ensembl and UCSC, and more, all free and in one place.

Every Genestack user has free access to public data: smart Genestack files imported from multiple public resources. Each file is a biological object, ready to be used in various applications. We import and index all metadata from the source databases, which means you can use Genestack as a public data search engine.

Public and Private Cloud, PaaS and Customer Hardware

Genestack.Org is the community edition of the Genestack platform, running as a cloud service only. Here you have free access, with unlimited storage and compute capacity, to our growing set of genomics tools, curated datasets, and indexed, format-free data from public repositories worldwide. It is also a place for you to host, share and collaborate on your data and applications.

Depending on your needs, Genestack is also available as Platform-as-a-Service, deployed on a shared or private cloud, with customized, adaptable compute and storage capacity and enterprise-level features, support and training. Flexible payment models allow you to control your bioinformatics resource spending.

Genestack can also be installed on your hardware, in your in-house data center, customized to your needs, with cloud-burst capacity. We have worked with industry-leading hardware vendors and can provide a complete turnkey hardware and software solution.

Toolkit for bioinformatics developers to build and publish apps

An application can be as simple as a wrapper around a script in Python, R or another language, or around a Unix-based command line utility. We support multiple concurrent versions, so you can let users choose which version of a tool, such as an aligner, they want to run.

Moreover, you can build fully functional interactive applications. Our SDK includes UI widgets and visualisation features, packages various graphical components and allows you to easily construct powerful applications that go beyond the command line.

If you are a bioinformatician, you have probably written your share of scripts. We give you the tools to take your scripts to the next level. They’ll become format-independent, scalable, robust and reproducible.

Every Genestack user belongs to an organization, and every organization is assigned a vendor identifier. All users can create and publish new applications. You can share an application with other users or publish it for everyone to use.

Not yet convinced?  

Check out our demos and tutorials

Get started now:

Sign Up for free
