WWW by Supreet Agarwal

SigCom LINCS: data and metadata search engine for a million gene expression signatures

Millions of transcriptomics samples were generated by the Library of Integrated Network-Based Cellular Signatures (LINCS) program. When these data are processed into searchable signatures along with signatures extracted from Genotype-Tissue Expression (GTEx) and the Gene Expression Omnibus (GEO), connections between drugs, genes, pathways, and diseases can be illuminated. SigCom LINCS is a web-based search engine that serves over 1.5 million gene expression signatures processed, analyzed, and visualized from LINCS, GTEx, and GEO. SigCom LINCS is built from the Signature Commons framework, a cloud-agnostic generic platform that can be used to stand up Data Commons with a focus on searchable signatures. SigCom LINCS provides rapid signature similarity search for mimickers and reversers given sets of up and down genes. Additionally, users of SigCom LINCS can perform a metadata search to find and analyze subsets of signatures, and find information about genes and drugs. SigCom LINCS is findable, accessible, interoperable, and reusable (FAIR) compliant, with metadata linked to standard ontologies and vocabularies, while all data and signatures within SigCom LINCS are available for download and via a well-documented API. In summary, SigCom LINCS has the potential to accelerate drug and target discovery in systems pharmacology.
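
The idea of searching for mimickers and reversers from up and down gene sets can be illustrated with a minimal sketch. This is not SigCom LINCS's actual scoring method, which the abstract does not describe; the signed-overlap score, the function names and the toy signature names below are all assumptions made for illustration.

```python
def direction_score(query_up, query_down, sig_up, sig_down):
    """Signed overlap score in [-1, 1] (hypothetical scoring, not SigCom's own).

    Positive values suggest a mimicker (same direction as the query),
    negative values suggest a reverser (opposite direction).
    """
    query_up, query_down = set(query_up), set(query_down)
    sig_up, sig_down = set(sig_up), set(sig_down)
    # Genes regulated in the same direction in query and signature.
    concordant = len(query_up & sig_up) + len(query_down & sig_down)
    # Genes regulated in opposite directions.
    discordant = len(query_up & sig_down) + len(query_down & sig_up)
    total = len(query_up) + len(query_down)
    return (concordant - discordant) / total if total else 0.0


def rank_signatures(query_up, query_down, signatures):
    """Sort signatures from strongest mimicker to strongest reverser.

    `signatures` maps a signature name to an (up_genes, down_genes) pair.
    """
    scored = [
        (name, direction_score(query_up, query_down, up, down))
        for name, (up, down) in signatures.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Toy signatures with made-up names and genes.
    toy = {
        "sig_A": ({"G1", "G2"}, {"G3"}),
        "sig_B": ({"G3"}, {"G1", "G2"}),
    }
    for name, score in rank_signatures({"G1", "G2"}, {"G3"}, toy):
        print(name, score)
```

Under this toy score, the top of the ranking holds candidate mimickers and the bottom holds candidate reversers.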

ICARUS, an interactive web server for single-cell RNA-seq analysis

This application was designed to guide the user through single-cell RNA-seq analysis using the Seurat scRNA-seq analysis toolkit via a tutorial-style interface. It offers user control over each step to personalise the analysis for the dataset of interest. Graphical outputs at each analysis step ensure easy and logical interpretation. The purpose of this application is to allow the user to interactively visualize single-cell RNA-seq data without requiring previous R programming knowledge.

scIMC: a platform for benchmarking comparison and visualization analysis of scRNA-seq data imputation methods

scIMC is composed of two modules: an Imputation module and a Downstream analysis module. It allows users to run 12 state-of-the-art imputation methods (six model-based methods and six deep learning-based methods), and to comprehensively evaluate and compare their performance in recovering gene expression, clustering cells, detecting differential gene expression and reconstructing cellular trajectories. scIMC is the first online platform that integrates all available state-of-the-art imputation methods for benchmarking comparison and visualization analysis.

SubcellulaRVis: a web-based tool to simplify and visualise subcellular compartment enrichment

SubcellulaRVis is a tool for visualising enrichment of Gene Ontology Cellular Compartments within gene lists. Users supply a gene or protein list as text or a .csv file and select the correct organism and gene/protein identifier type. Enrichment can be calculated based on the whole cell or the endosomal system.
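
As a rough illustration of what a compartment enrichment calculation can look like, the sketch below uses a one-sided hypergeometric test. The abstract does not state which test SubcellulaRVis uses, so the choice of test, the function name and the parameter names are assumptions.

```python
from math import comb


def hypergeom_enrichment_p(background_size, compartment_size, list_size, overlap):
    """One-sided hypergeometric p-value P(X >= overlap).

    background_size:  number of genes in the background (e.g. whole cell)
    compartment_size: background genes annotated to the compartment
    list_size:        number of genes in the user's list
    overlap:          genes in the list annotated to the compartment
    """
    upper = min(compartment_size, list_size)
    total = comb(background_size, list_size)
    # Sum the probability of seeing `overlap` or more annotated genes by chance.
    tail = sum(
        comb(compartment_size, k)
        * comb(background_size - compartment_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 4 background genes, 2 in the compartment, list of 2, both hit.
    print(hypergeom_enrichment_p(4, 2, 2, 2))
```

A small p-value indicates that the compartment is over-represented in the list relative to the chosen background; in practice, p-values across many compartments would also need multiple-testing correction.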

Gene-SCOUT: identifying genes with similar continuous trait fingerprints from phenome-wide association analyses

Gene-SCOUT aims to find genes similar to a particular gene of interest; for each gene, a unique signature is constructed. The method exploits associations derived from 450,000 exomes sequenced in the UK Biobank, as well as 120,000 samples of metabolomic data. For a given gene, its signature comprises a collection of associations between variants of the gene and phenotypic traits measured in the UK Biobank.

RSAT 2022: regulatory sequence analysis tools

RSAT (Regulatory Sequence Analysis Tools) enables the detection and the analysis of cis-regulatory elements in genomic sequences. This software suite performs (i) de novo motif discovery (including from genome-wide datasets like ChIP-seq/ATAC-seq), (ii) genomic sequence scanning with known motifs, (iii) motif analysis (quality assessment, comparisons and clustering), (iv) analysis of regulatory variations and (v) comparative genomics. RSAT comprises 50 tools. Six public Web servers (including a teaching server) are offered to meet the needs of different biological communities. RSAT philosophy and originality are: (i) multi-modal access depending on the user's needs, through web forms, command-line for local installation and programmatic web services, and (ii) support for virtually any genome (animals, bacteria, plants, totalling over 10 000 genomes directly accessible). Since the 2018 NAR Web Software Issue, we have developed a large REST API, extended the support for additional genomes and external motif collections, enhanced some tools and Web forms, and developed a novel tool that builds or refines gene regulatory networks using motif scanning (network-interactions). The RSAT website provides extensive documentation, tutorials and published protocols. RSAT code is under an open-source license and is now hosted on GitHub. RSAT is available at http://www.rsat.eu/.

GenomicSuperSignature facilitates interpretation of RNA-seq experiments through robust, efficient comparison to public databases

GenomicSuperSignature is a toolkit for interpreting new RNA-seq datasets in the context of a large-scale database of previously published and annotated results. As an exploratory data analysis tool, GenomicSuperSignature matches PCA axes in a new dataset to an annotated index of Replicable Axes of Variation (RAVs) represented in previously published independent datasets. GenomicSuperSignature can also be used as a tool for transfer learning, utilizing RAVs as well-defined and replicable latent variables defined by multiple previous studies in place of de novo latent variables. The interpretability of RAVs is enhanced through annotations by MEdical Subject Headings (MeSH) and Gene Set Enrichment Analysis (GSEA). Through the use of pre-built, pre-annotated, dimension-reduced RAVs, GenomicSuperSignature applies knowledge from tens of thousands of samples and from PubMed and MSigDB to the dataset at hand within seconds on an ordinary laptop. GenomicSuperSignature is implemented as an R/Bioconductor package for straightforward incorporation into popular RNA-seq analysis pipelines.
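
To make the idea of matching a PCA axis to an index of RAVs concrete, here is a minimal sketch that compares gene loadings by Pearson correlation. It is written in Python for illustration only and does not use the R/Bioconductor package; the use of Pearson correlation, the function names and the toy RAV names are assumptions, not the package's documented method.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def best_matching_rav(pc_loadings, ravs):
    """Return (rav_name, correlation) for the RAV most correlated with a PC.

    pc_loadings: dict gene -> loading for one principal component
    ravs:        dict rav_name -> dict gene -> loading (hypothetical index)
    Only genes shared by the PC and the RAV are compared.
    """
    genes = sorted(pc_loadings)
    best = None
    for name, rav in ravs.items():
        shared = [g for g in genes if g in rav]
        if len(shared) < 2:
            continue  # too few shared genes to correlate
        r = pearson([pc_loadings[g] for g in shared], [rav[g] for g in shared])
        # The sign of a PCA axis is arbitrary, so rank by absolute correlation.
        if best is None or abs(r) > abs(best[1]):
            best = (name, r)
    return best


if __name__ == "__main__":
    pc1 = {"G1": 1.0, "G2": 2.0, "G3": 3.0}
    index = {
        "RAV_x": {"G1": 3.0, "G2": 1.0, "G3": 2.0},
        "RAV_y": {"G1": 2.0, "G2": 4.0, "G3": 6.0},
    }
    print(best_matching_rav(pc1, index))
```

Once a matching RAV is found, its existing annotations (for example MeSH terms or enriched gene sets) could be looked up to help interpret the new axis.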

LipidSuite: interactive web server for lipidomics differential and enrichment analysis

Advances in mass spectrometry have enabled high-throughput profiling of lipids, but differential analysis and biological interpretation of lipidomics datasets remain challenging. To overcome this barrier, we present LipidSuite, an end-to-end differential lipidomics data analysis server. LipidSuite offers a step-by-step workflow for preprocessing, exploration, differential analysis and enrichment analysis of untargeted and targeted lipidomics. This free, user-friendly webserver facilitates differential lipidomics data analysis and re-analysis, and fully harnesses biological interpretation from lipidomics datasets. LipidSuite is freely available at http://suite.lipidr.org.

CRISPRloci: comprehensive and accurate annotation of CRISPR-Cas systems

CRISPR-Cas systems are adaptive immune systems in prokaryotes, providing resistance against invading viruses and plasmids. The identification of CRISPR loci is currently a non-standardized, ambiguous process, requiring the manual combination of multiple tools, where existing tools detect only parts of the CRISPR-systems, and lack quality control, annotation and assessment capabilities of the detected CRISPR loci. Our CRISPRloci server provides the first resource for the prediction and assessment of all possible CRISPR loci. The server integrates a series of advanced Machine Learning tools within a seamless web interface featuring: (i) prediction of all CRISPR arrays in the correct orientation; (ii) definition of CRISPR leaders for each locus; and (iii) annotation of cas genes and their unambiguous classification. As a result, CRISPRloci is able to accurately determine the CRISPR array and associated information, such as: the Cas subtypes; cassette boundaries; accuracy of the repeat structure, orientation and leader sequence; virus-host interactions; self-targeting; as well as the annotation of cas genes, all of which have been missing from existing tools. In summary, CRISPRloci constitutes a full suite for CRISPR-Cas system characterization that offers annotation quality previously available only after manual inspection. The webserver can be accessed via the following link: https://rna.informatik.uni-freiburg.de/CRISPRloci/. The standalone version can be downloaded from the following GitHub repository: https://github.com/BackofenLab/CRISPRloci.

TIMEOR: a web-based tool to uncover temporal regulatory mechanisms from multi-omics data

Uncovering how transcription factors regulate their targets at DNA, RNA and protein levels over time is critical to define gene regulatory networks (GRNs) and assign mechanisms in normal and diseased states. RNA-seq is a standard method measuring gene regulation using an established set of analysis stages. However, none of the currently available pipeline methods for interpreting ordered genomic data (in time or space) use time-series models to assign cause and effect relationships within GRNs, are adaptive to diverse experimental designs, or enable user interpretation through a web-based platform. Furthermore, methods integrating ordered RNA-seq data with protein-DNA binding data to distinguish direct from indirect interactions are urgently needed. TIMEOR (Trajectory Inference and Mechanism Exploration with Omics data in R) is the first web-based and adaptive time-series multi-omics pipeline method that infers the relationship between gene regulatory events across time. TIMEOR addresses the critical need for methods to determine causal regulatory mechanism networks by leveraging time-series RNA-seq, motif analysis, protein-DNA binding data, and protein-protein interaction networks. TIMEOR’s user-catered approach helps non-coders generate new hypotheses and validate known mechanisms. We used TIMEOR to identify a novel link between insulin stimulation and the circadian rhythm cycle. TIMEOR is available at https://github.com/ashleymaeconard/TIMEOR.git and http://timeor.brown.edu.

DGLinker: flexible knowledge-graph prediction of disease-gene associations

DGLinker is a tool for the prediction of novel human disease-gene associations given a set of genes that are known to be associated with the target human phenotype(s). In brief, utilizing a set of databases of biological and phenotypic information, the tool generates a knowledge graph. An enrichment test is then used to identify predictive features of the genes known to be associated with the target phenotype(s). The total adjacency of every gene with all predictors of each type (the columns of the matrix) is calculated from the graph. The adjacency matrix is then scaled and weighted to produce a final score for every gene. Predictions are made by applying a threshold to this similarity score, with all genes above the threshold predicted as candidate genes. The optimum weighting and score threshold are learned from the known set of associated genes. DGLinker is available at https://dglinker.rosalind.kcl.ac.uk. The webserver is free and open to all users without the need for registration.
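
The scoring steps described above (per-type adjacency totals, scaling, weighting, thresholding) can be sketched as follows. This is a simplified illustration and not DGLinker's implementation: the max-based column scaling, the fixed weights and threshold (which DGLinker learns from the known genes), and all names are assumptions.

```python
def score_genes(adjacency, weights):
    """Combine per-type adjacency totals into one score per gene.

    adjacency: dict gene -> list of adjacency totals, one per predictor type
               (the columns of the matrix)
    weights:   list of weights, one per predictor type (assumed given here)
    """
    n_types = len(weights)
    # Scale each column by its maximum so predictor types are comparable
    # (an assumed scaling); fall back to 1.0 for all-zero columns.
    col_max = [
        max(row[j] for row in adjacency.values()) or 1.0 for j in range(n_types)
    ]
    return {
        gene: sum(w * row[j] / col_max[j] for j, w in enumerate(weights))
        for gene, row in adjacency.items()
    }


def predict_candidates(scores, threshold):
    """Genes whose score exceeds the threshold, in alphabetical order."""
    return sorted(gene for gene, score in scores.items() if score > threshold)


if __name__ == "__main__":
    # Toy matrix: three genes, two predictor types.
    toy_adjacency = {"A": [4, 0], "B": [2, 2], "C": [0, 1]}
    scores = score_genes(toy_adjacency, weights=[0.5, 0.5])
    print(scores)
    print(predict_candidates(scores, threshold=0.4))
```

In a fuller version, the weights and threshold would be tuned so that the genes already known to be associated with the phenotype score above the cut-off.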

Mechnetor: a web server for exploring protein mechanism and the functional context of genetic variants

Mechnetor lets you quickly explore and visualize integrated protein mechanism data, enabling a better understanding of the functional context of genetic variants. You can enter lists of interacting proteins and/or lists of genetic variants or post-translational modifications. As a result, you will get a finer-resolution interaction network that enhances mechanistic interpretations of biological processes and variants of interest. https://academic.oup.com/nar/article/49/W1/W366/6291159

b2bTools: online predictions for protein biophysical features and their conservation

https://bio2byte.be/b2btools/

b2bTools provides integrated protein sequence-based predictions via https://bio2byte.be/b2btools/. The aim of the predictions is to identify the biophysical behaviour or features of proteins that are not readily captured by structural biology and/or molecular dynamics approaches. Upload of a FASTA file or text input of a sequence provides integrated predictions from DynaMine backbone and side-chain dynamics, conformational propensities, and derived EFoldMine early folding, DisoMine disorder, and Agmata beta-sheet aggregation. These predictions, several of which were previously not available online, capture 'emergent' properties of proteins, i.e. the inherent biophysical propensities encoded in their sequence, rather than context-dependent behaviour (e.g. final folded state). Online visualisation is available as interactive plots, with brief explanations and tutorial pages included.

DrugComb update: a more comprehensive drug sensitivity data repository and analysis portal

https://drugcomb.org/

Combinatorial therapies that target multiple pathways have shown great promise for treating complex diseases. DrugComb (https://drugcomb.org/) is a web-based portal for the deposition and analysis of drug combination screening datasets. Since its first release, DrugComb has received continuous updates on the coverage of data resources, as well as on the functionality of the web server to improve the analysis, visualization and interpretation of drug combination screens. Here, we report significant updates of DrugComb, including: (i) manual curation and harmonization of more comprehensive drug combination and monotherapy screening data, not only for cancers but also for other diseases such as malaria and COVID-19; (ii) enhanced algorithms for assessing the sensitivity and synergy of drug combinations; (iii) network modelling tools to visualize the mechanisms of action of drugs or drug combinations for a given cancer sample and (iv) state-of-the-art machine learning models to predict drug combination sensitivity and synergy. These improvements come with a more user-friendly graphical interface and a faster database infrastructure, which make DrugComb the most comprehensive web-based resource for the study of drug sensitivities for multiple diseases.

GEPIA2021: integrating multiple deconvolution-based analysis into GEPIA

http://gepia2021.cancer-pku.cn/

The GEPIA (Gene Expression Profiling Interactive Analysis) webserver facilitates widely used analyses based on the bulk gene expression datasets in the TCGA and GTEx projects, providing biologists and clinicians with a handy tool to perform comprehensive and complex data mining tasks. Recently, deconvolution tools have driven a trend towards resolving bulk RNA datasets at cell type-level resolution, and interrogating the characteristics of different cell types in cancer and controlled cohorts has become an important strategy for investigating biological questions. Thus, GEPIA2021, a standalone extension of GEPIA, allows users to perform multiple interactive analyses based on the deconvolution results, including cell type-level proportion comparison, correlation analysis, differential expression, and survival analysis. With GEPIA2021, experimental biologists can easily explore the large TCGA and GTEx datasets and validate their hypotheses at an enhanced resolution. GEPIA2021 is publicly accessible at http://gepia2021.cancer-pku.cn/.

ProLint: a web-based framework for the automated data analysis and visualization of lipid-protein interactions

https://www.prolint.ca/

The functional activity of membrane proteins is carried out in a complex lipid environment. Increasingly, it is becoming clear that lipids are an important player in regulating or generally modulating their activity. A routinely used method to gain insight into this interplay between lipids and proteins is Molecular Dynamics (MD) simulation, since it allows us to study interactions at atomic or near-atomic detail as a function of time. A major bottleneck, however, is analyzing and visualizing lipid-protein interactions, which, in practice, is a time-demanding task. ProLint (www.prolint.ca) is a webserver that completely automates analysis of MD-generated files and visualization of lipid-protein interactions. Analysis is modular, allowing users to select their preferred method, and visualization is entirely interactive through custom-built applications that enable a detailed qualitative and quantitative exploration of lipid-protein interactions. ProLint also includes a database of published MD results that have been processed through the ProLint workflow and can be visualized by anyone regardless of their level of experience with MD. The automated analysis, feature-rich visualization, database integration, and open-source distribution with an easy-to-install process will allow ProLint to become a routine workflow in lipid-protein interaction studies.

CNVxplorer: a web tool to assist clinical interpretation of CNVs in rare disease patients

http://cnvxplorer.com

Copy Number Variants (CNVs) are an important cause of rare diseases. Array-based Comparative Genomic Hybridization tests yield a ~12% diagnostic rate, with ~8% of patients presenting CNVs of unknown significance. CNV interpretation is particularly challenging in genomic regions outside of those overlapping with previously reported structural variants or disease-associated genes. Recent studies showed that a more comprehensive evaluation of CNV features, leveraging both coding and non-coding impacts, can significantly improve diagnostic rates. However, currently available CNV interpretation tools are mostly gene-centric or provide only non-interactive annotations that are difficult to assess in clinical practice. Here, we present CNVxplorer, a web server suited for the functional assessment of CNVs in a clinical diagnostic setting. CNVxplorer mines a comprehensive set of clinical, genomic, and epigenomic features associated with CNVs. It provides sequence constraint metrics, impact on regulatory elements and topologically associating domains, as well as expression patterns. The analyses offered cover (a) agreement with patient phenotypes; (b) visualizations of associations among genes, regulatory elements and transcription factors; (c) enrichment on functional and pathway annotations and (d) co-occurrence of terms across PubMed publications related to the query CNVs. A flexible evaluation workflow allows dynamic re-interrogation in clinical sessions. CNVxplorer is publicly available at http://cnvxplorer.com.

snpXplorer: a web application to explore human SNP-associations and annotate SNP-sets

https://snpxplorer.net

Genetic association studies are frequently used to study the genetic basis of numerous human phenotypes. However, rapidly interrogating how well a certain genomic region associates across traits, as well as interpreting genetic associations, is often complex and requires the integration of multiple sources of annotation, which involves advanced bioinformatic skills. We developed snpXplorer, an easy-to-use web-server application for exploring Single Nucleotide Polymorphism (SNP) association statistics and functionally annotating sets of SNPs. snpXplorer can superimpose association statistics from multiple studies, and displays regional information including SNP associations, structural variations, recombination rates, eQTL, linkage disequilibrium patterns, genes and gene expression per tissue. By overlaying multiple GWAS studies, snpXplorer can be used to compare levels of association across different traits, which may help the interpretation of variant consequences. Given a list of SNPs, snpXplorer can also be used to perform variant-to-gene mapping and gene-set enrichment analysis to identify molecular pathways that are overrepresented in the list of input SNPs. snpXplorer is freely available at https://snpxplorer.net. Source code, documentation, example files and tutorial videos are available within the Help section of snpXplorer and at https://github.com/TesiNicco/snpXplorer.

Arena3D(web): interactive 3D visualization of multilayered networks

http://bib.fleming.gr/Arena3D

Efficient integration and visualization of heterogeneous biomedical information in a single view is a key challenge. In this study, we present Arena3D(web), the first, fully interactive and dependency-free, web application which allows the visualization of multilayered graphs in 3D space. With Arena3D(web), users can integrate multiple networks in a single view along with their intra- and inter-layer connections. For clearer and more informative views, users can choose between a plethora of layout algorithms and apply them on a set of selected layers either individually or in combination. Users can align networks and highlight node topological features, whereas each layer as well as the whole scene can be translated, rotated and scaled in 3D space. User-selected edge colors can be used to highlight important paths, while node positioning, coloring and resizing can be adjusted on-the-fly. In its current version, Arena3D(web) supports weighted and unweighted undirected graphs and is written in R, Shiny and JavaScript. We demonstrate the functionality of Arena3D(web) using two different use-case scenarios; one regarding drug repurposing for SARS-CoV-2 and one related to GPCR signaling pathways implicated in melanoma. Arena3D(web) is available at http://bib.fleming.gr:3838/Arena3D or http://bib.fleming.gr/Arena3D.

ProteoSign v2: a faster and evolved user-friendly online tool for statistical analyses of differential proteomics

http://bioinformatics.med.uoc.gr/ProteoSign

Bottom-up proteomics analyses have proved over recent years to be a powerful tool for characterizing the proteome and are crucial for understanding cellular and organism behaviour. Through differential proteomic analysis, researchers can shed light on groups of proteins or individual proteins that play key roles in certain normal or pathological conditions. However, several tools for the analysis of such complex datasets are powerful but hard to use, with steep learning curves. Other tools are easy to use but weak in terms of analytical power. Previously, we introduced ProteoSign, a powerful yet user-friendly open-source online platform for protein differential expression/abundance analysis designed with the end-proteomics user in mind. Part of ProteoSign’s power stems from the utilization of the well-established Linear Models For Microarray Data (LIMMA) methodology. Here, we present a substantial upgrade of this computational resource, called ProteoSign v2, where we introduce major improvements, also based on user feedback. The new version offers more plot options, supports additional experimental designs, analyzes updated input datasets and performs a gene enrichment analysis of the differentially expressed proteins. We also introduce deployment with Docker and significantly increase the speed of a full analysis. ProteoSign v2 is available at http://bioinformatics.med.uoc.gr/ProteoSign.