Genes are not randomly distributed in the genome. In humans, 10% of protein-coding genes are transcribed from bidirectional promoters and many more are organised in larger clusters. Intriguingly, neighbouring genes are frequently coexpressed but rarely functionally related. We recently showed that coexpression of bidirectional gene pairs, and of nearby genes in general, is buffered at the protein level (Kustatscher et al., 2017). Taking into account the 3D architecture of the genome, we found that co-regulation of spatially close, functionally unrelated genes is pervasive at the transcriptome level but does not extend to the proteome. Non-functional mRNA coexpression in human cells appears to arise from stochastic chromatin fluctuations and direct regulatory interference between spatially close genes. Protein-level buffering likely reflects a lack of coordinated post-transcriptional regulation of functionally unrelated genes. Grouping human genes together along the genome sequence, or through long-range chromosome folding, is associated with reduced expression noise. Our results support the hypothesis that selection for noise reduction is a major driver of the evolution of genome organisation. The prevalence of non-functional coexpression at the transcript level, but not at the protein level, suggests that proteomics data should outperform transcriptomics data when screening for functional links between genes. We decided to follow up on this by collating protein expression datasets and mining them for functional protein associations using machine learning.
The annotation of protein function is a longstanding challenge in cell biology, made difficult by the sheer magnitude of the task. We therefore developed ProteomeHD, which documents the response of 10,323 human proteins to 294 biological perturbations, measured by isotope-labelling mass spectrometry. Using this data matrix and robust machine learning, we created a co-regulation map of the cell that reflects functional associations between human proteins. The map outperforms predictions made by STRING on the basis of the NCBI GEO repository, which currently holds mRNA expression profiling data from more than one million human samples. Our map identifies a functional context for many uncharacterized proteins, including microproteins that are difficult to study with traditional methods. Co-regulation also captures relationships between proteins that do not physically interact or co-localize. For example, co-regulation of the peroxisomal membrane protein PEX11 with mitochondrial respiration factors led us to discover a novel organelle interface between peroxisomes and mitochondria in mammalian cells. The co-regulation map can be explored at www.proteomeHD.net.
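The idea of scoring co-regulation from a proteins-by-perturbations matrix can be illustrated with a minimal sketch. It is not the machine-learning method used for ProteomeHD, which is not specified here; it swaps in a plain pairwise Pearson correlation computed over the perturbations in which both proteins were quantified. The function names, the min_shared threshold, the protein names and the toy values are all hypothetical and chosen only for illustration.

```python
import numpy as np


def coregulation_scores(ratios, min_shared=3):
    """Pairwise Pearson correlation between protein profiles (rows).

    Assumption: `ratios` holds one row per protein and one column per
    perturbation, with NaN where a protein was not quantified. Pairs with
    fewer than `min_shared` jointly quantified perturbations stay NaN.
    """
    n = ratios.shape[0]
    scores = np.full((n, n), np.nan)
    for i in range(n):
        for j in range(i + 1, n):
            shared = ~np.isnan(ratios[i]) & ~np.isnan(ratios[j])
            if shared.sum() < min_shared:
                continue  # too little overlap to score this pair
            a, b = ratios[i, shared], ratios[j, shared]
            if a.std() == 0 or b.std() == 0:
                continue  # correlation is undefined for a flat profile
            scores[i, j] = scores[j, i] = np.corrcoef(a, b)[0, 1]
    return scores


def top_partners(scores, names, query, k=2):
    """Return up to k proteins with the highest score against `query`."""
    idx = names.index(query)
    row = scores[idx]
    candidates = [j for j in range(len(names))
                  if j != idx and not np.isnan(row[j])]
    candidates.sort(key=lambda j: row[j], reverse=True)
    return [(names[j], float(row[j])) for j in candidates[:k]]


# Toy matrix: 4 hypothetical proteins x 5 hypothetical perturbations.
names = ["protA", "protB", "protC", "protD"]
ratios = np.array([
    [1.0, 2.0, 3.0, 4.0, np.nan],        # protA
    [2.0, 4.1, 5.9, 8.0, 1.0],           # protB: follows protA closely
    [4.0, 3.0, 2.0, 1.0, 0.5],           # protC: opposite trend to protA
    [np.nan, np.nan, 1.0, np.nan, 2.0],  # protD: rarely quantified
])

scores = coregulation_scores(ratios)
partners = top_partners(scores, names, "protA", k=1)
print(partners)
```

In this toy example protB ranks as the closest partner of protA, while protD receives no score against protA because the two share only one quantified perturbation. A plain correlation like this is sensitive to outliers and noise, which is one reason a more robust scoring approach may be preferable on real data.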
Our lab also continues to develop cross-linking/mass spectrometry as a tool to investigate the structures of proteins and their complexes in cells.
Kustatscher, G., Grabowski, P., and Rappsilber, J. (2017). Pervasive coexpression of spatially proximal genes is buffered at the protein level. Mol. Syst. Biol. 13, 937.
Kolbowski, L., Mendes, M.L., and Rappsilber, J. (2017). Optimizing the Parameters Governing the Fragmentation of Cross-Linked Peptides in a Tribrid Mass Spectrometer. Anal. Chem. 89, 5311–5318.
Schneider, M., Belsom, A., and Rappsilber, J. (2018). Protein Tertiary Structure by Crosslinking/Mass Spectrometry. Trends Biochem. Sci. 43, 157–169.