Tsinghua Bioinformatics
MOE Key Laboratory of Bioinformatics, Bioinformatics Division, TNLIST / Department of Automation, Tsinghua University
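In April 2017 the lab published a paper in Nucleic Acids Research describing an alternative splicing analysis method implemented mainly as shell scripts. Its comparative analyses also gave good results.
The paper gives a clear account of how to quantify alternative first exons (AFEs) and interpret their function, and it also optimizes and reinterprets the existing statistical models.
First, it comprehensively shows and enumerates all the distribution patterns of first exons.
Five methods are then compared on how well they identify AFE events in the nuclear and cytoplasmic fractions, using both strand-specific and non-strand-specific data.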
Performance assessment of the five methods for FE identification using the reference CAGE data. (A and B) The receiver operating characteristic (ROC) curves of the five methods on non-strand-specific RNA-seq data of the nuclear (A) and cytoplasmic fractions (B) of the KhES cell line. (C and D) The ROC curves of the five methods on strand-specific RNA-seq data of the nuclear (C) and cytoplasmic fractions (D) of the H1-hES cell line. The logistic regression model has the best performance in all cases.
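As a rough illustration of how ROC curves like these are built (a sketch only, not the paper's evaluation code): each candidate FE receives a score from a method and a label saying whether the CAGE reference supports it. Sweeping a threshold over the scores gives the curve, and the area under it summarizes performance. The function names and the toy scores and labels below are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) points from sweeping a threshold over the scores.

    scores: higher means more likely to be a true first exon.
    labels: 1 if the candidate is supported by the reference (e.g. CAGE), else 0.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Candidates with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data (hypothetical): six candidate FEs, three supported by the reference.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
toy_labels = [1, 1, 0, 1, 0, 0]
toy_auc = auc(roc_points(toy_scores, toy_labels))
print(round(toy_auc, 4))
```

A method whose curve lies closer to the top-left corner has a larger area, which is the sense in which one method "performs best" in a figure of this kind.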
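Identification and features of FEs across multiple cell types: FEs are identified in different cell lines, using CAGE data as the reference data.
The correlation plot in panel A shows that the correlation scores are all fairly high. In the comparison in panel B, the CAGE coverage curve over the TSS region serves as the gold standard: CAGE-detectable FEs are shown in red and CAGE-undetectable ones in sky blue, while known FEs found by SEASTAR are shown in dark blue and novel ones in teal. The SEASTAR results largely follow the CAGE trend, and their average coverage differs only slightly.
Specific examples: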
Examples of differentially used AFEs and tandem TSSs between the GM12878 and K562 cell lines. (A) Differentially used AFEs in gene RPS6KA1 (sense strand). (B) Differentially used AFEs in gene BIN1 (antisense strand). (C) Differentially used tandem TSSs in gene ATP6V1E2 (antisense strand). (D) Differentially used tandem TSSs in gene SLC35D1 (antisense strand).
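After the routine processing steps, PSI quantification of the AFE events is complete. The paper plots how AFE PSI changes over the course of pluripotent cell differentiation, together with a heatmap of 126 specifically expressed transcription factors (TFs) that likewise change along the iPSC reprogramming process. A further heatmap shows the standard Pearson correlation coefficients between AFE PSI and TF expression. All of these show fairly clear clusters and specific trends.
The genes labeled on the right are the top 10 most variable candidates.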
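The two quantities behind these plots can be sketched as follows. Here PSI is taken as the fraction of a gene's first-exon reads that are assigned to one AFE. That plain ratio is an assumption made for illustration and is not SEASTAR's actual estimator. The read counts and TF expression values are made-up toy numbers.

```python
import math


def psi(afe_reads, total_first_exon_reads):
    """Simplified PSI: share of a gene's first-exon reads assigned to one AFE."""
    if total_first_exon_reads == 0:
        return float("nan")
    return afe_reads / total_first_exon_reads


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Toy data (hypothetical): one AFE across four reprogramming time points.
afe_reads = [10, 20, 30, 40]
total_reads = [100, 100, 100, 100]
tf_expression = [1.0, 2.0, 3.0, 4.0]

psi_values = [psi(a, t) for a, t in zip(afe_reads, total_reads)]
pcc = pearson(psi_values, tf_expression)
print(psi_values, round(pcc, 4))
```

Computing this coefficient for every AFE and TF pair yields the matrix that a correlation heatmap visualizes.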
Finally, the paper looks at one example, Mycn, which was selected with a P-value of 0.00028.
We found multiple TFs known to be key regulators of reprogramming including the top ranked N-Myc (Mycn) gene (with P-value of 0.00028). AFEs containing the Mycn motif were significantly enriched towards the top of the AFEs positively correlated with Mycn expression in our enrichment analysis. We further investigated the expression level of Mycn, as well as the average PSI values of AFEs that contain the Mycn motif and have strong positive correlation with Mycn expression (PCC > 0.5) (Figure 6C). The significant increase of Mycn expression during iPSC reprogramming (P-value = 8.9e-16, ANOVA test) was accompanied by an increase in the relative usage of these AFEs. The coordinated change in expression levels between Mycn and the differentially used AFEs containing the Mycn motif suggests that Mycn binds to and promotes the usage of these AFEs. Mycn is known to play an essential role in the maintenance of pluripotency. Mycn can cooperate with other TFs to reprogram adult cells into other differentiated cells or into iPS cells. Msx2, another transcription factor identified in our enrichment analysis, is a major driver of de-differentiation in mammalian muscle cells.
Collectively, these data imply that TFs with high scores from the enrichment analysis of differential AFEs play important roles in iPSC reprogramming and the regulation of the pluripotent state. (This is the summary statement.)
The figure demonstrates how Mycn expression and the PSI values of the related FE events change across developmental stages.
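The ANOVA test quoted above asks whether mean expression differs between stages. A minimal sketch of the one-way ANOVA F statistic is given here, with each stage treated as one group of replicate measurements. The grouping and the numbers are hypothetical, and turning F into a P-value additionally needs the F distribution (available in a statistics library), which this sketch leaves out.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of groups of values."""
    k = len(groups)                       # number of groups (stages)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: how far each stage mean is from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Within-group sum of squares: spread of replicates around their stage mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Toy data (hypothetical): expression of one gene at three stages, three replicates each.
stages = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(stages)
print(round(f_stat, 4))
```

A large F means the differences between stage means are large relative to the replicate noise, which is what a very small P-value such as the one quoted reflects.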