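A Tutorial on RNAhybrid, a miRNA Binding Site Prediction Tool

Introduction to RNAhybrid

RNAhybrid is a miRNA target prediction tool developed by Rehmsmeier M et al., based on the secondary structure of the miRNA/target duplex. Its algorithm forbids intramolecular base pairing as well as duplexes between miRNA molecules or between target molecules, and it finds the best target site according to the hybridization energy between the miRNA and the target. Although the computational cost grows with the length of the target sequence, RNAhybrid is still markedly faster than other RNA secondary structure prediction tools such as mfold, RNAfold, RNAcofold and pairfold. In addition, RNAhybrid lets the user set free energy and p-value cut-offs and constrain the hybridization site, for example by requiring that it include nucleotides 2-7 from the 5' end of the miRNA.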

1. Downloading and installing RNAhybrid

wget https://bibiserv.cebitec.uni-bielefeld.de/applications/rnahybrid/resources/downloads/RNAhybrid-2.1.2.tar.gz
tar -xzvf RNAhybrid-2.1.2.tar.gz
cd /path/to/RNAhybrid-2.1.2
./configure
sudo make  # use administrator mode here where possible, otherwise errors are likely
sudo make install

To verify the installation, run which RNAhybrid; if it prints a path, the installation succeeded. This was tested on Ubuntu running under WSL on Windows 10.
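The same check can be scripted. This is a small sketch, assuming a POSIX shell; have_tool is a hypothetical helper name, and command -v is used as the portable counterpart of which.

```shell
# have_tool NAME: succeed if NAME resolves to an executable on PATH.
# (Hypothetical helper for this tutorial, not part of RNAhybrid.)
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

if have_tool RNAhybrid; then
    echo "RNAhybrid found at: $(command -v RNAhybrid)"
else
    echo "RNAhybrid not found on PATH; check where 'make install' put it" >&2
fi
```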

 

2. Preparing the input files

1. target sequence(s)

This file contains one or more sequences against which RNAhybrid hybridizes the miRNA(s). RNAhybrid uses all of these sequences to find minimum free energy hybridizations between the miRNA(s) and the target sequence(s). Sequences should be in RNA FASTA format, but RNAhybrid can also read DNA FASTA files. A single sequence can contain up to 50000 nucleotides.

The target sequences used here are the human circRNA FASTA file downloaded from circBase; for the download procedure, see my blog post https://www.cnblogs.com/yanjiamin/p/12057362.html

2. miRNA sequence(s)

This file contains one or more microRNAs that RNAhybrid hybridizes with the target sequences to find the minimum free energy hybridization. A single microRNA sequence can contain up to 2000 nucleotides.

The miRNA sequences used here are the mature human miRNA FASTA file downloaded from miRBase; for the download procedure, see my blog post https://www.cnblogs.com/yanjiamin/p/12057362.html
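The length limits quoted above can be checked before running RNAhybrid. The following is a minimal sketch in POSIX shell and awk, not part of RNAhybrid itself; the helper names fasta_lengths and too_long are made up for this illustration, and the limit is passed in by the caller rather than hard-coded.

```shell
# fasta_lengths FILE: print "<sequence id><TAB><length>" for each FASTA record.
# The id is the first word of the header line, without the leading ">".
fasta_lengths() {
    awk '
        /^>/ {
            if (id != "") print id "\t" len
            id = substr($1, 2); len = 0
            next
        }
        { gsub(/[ \t\r]/, ""); len += length($0) }   # ignore whitespace and CRs
        END { if (id != "") print id "\t" len }
    ' "$1"
}

# too_long FILE LIMIT: print only the records longer than LIMIT nucleotides.
too_long() {
    fasta_lengths "$1" | awk -F '\t' -v limit="$2" '$2 > limit'
}

# Example calls, using the file names from this post:
#   too_long SFTSV_24vscontrol_DEcircBase.fa 50000
#   too_long hsa_miRNA.fa 2000
```

Empty output from too_long means every record is within the given limit.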

 

3. Using RNAhybrid

Usage: RNAhybrid [options] [target sequence] [query sequence].

options:

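-b <number of hits per target>  # maximum number of hits reported for one miRNA against one target sequence. If a miRNA can bind a target sequence in several ways, -b 1 reports only the best hit; 1 is usually a good choice. The number of hits finally reported also depends on the <energy cut-off> setting.
-c compact output  # with this option each hit is printed on a single line. Recommended if you only want to check whether the results agree with the RNAhybrid calibration results.
-d <xi>,<theta>  # position and shape parameters
-f helix constraint
-h help
-m <max targetlength>
-n <max query length>
-u <max internal loop size (per side)>  # maximum number of unpaired bases on each side of an internal loop; -u 0 yields structures without any internal loops.
-v <max bulge loop size>  # an internal loop has unpaired bases on both strands, whereas a bulge loop is formed by extra unpaired bases on one strand only
-e <energy cut-off>  # free energy cut-off for the hybridization of the two sequences; try -e -30 first and see what you get.
-p <p-value cut-off>
-s (3utr_fly|3utr_worm|3utr_human)  # used for a quick estimate of the extreme value distribution parameters; choose 3utr_fly, 3utr_worm or 3utr_human (or omit the option) to fit the species better. According to the parameter description further down, helix constraint and approximate p-value cannot be used at the same time.
-g (ps|png|jpg|all)  # graphics output format: ps, png, jpg or all
-t <target file>  # target gene file in FASTA format
-q <query file>  # miRNA file in FASTA format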

Either a target file has to be given (FASTA format)
or one target sequence directly.

Either a query file has to be given (FASTA format)
or one query sequence directly.

The helix constraint format is "from,to", eg. -f 2,7 forces
structures to have a helix from position 2 to 7 with respect to the query.

<xi> and <theta> are the position and shape parameters, respectively,
of the extreme value distribution assumed for p-value calculation.
If omitted, they are estimated from the maximal duplex energy of the query.
In that case, a data set name has to be given with the -s flag.


PS graphical output not supported.


PNG and JPG graphical output not supported.

 

Name

Description

helix constraint from

Forces all structures to have a helix from position a to position b with respect to the query. The first base has position 1. The parameter "Helix constraint from" has to be lower than or equal to the parameter "Helix constraint to". You cannot use Helix constraint and approximate p-values at the same time.

hits per target

Defines how many hits RNAhybrid shows. Hits are listed in order of increasing minimum free energy (the lower the energy, the better the result).

Compact output

When this parameter is used, RNAhybrid prints only one line of output instead of the full output it normally generates.

Generate graphics

Generates a graphical representation of the output in jpg, png and ps format if fewer than 6 hits are chosen. If RNAhybrid stops with an unexpected error, it is often a good idea to disable graphics generation.

Max internal loop length

The maximal number of unpaired nucleotides on either side of an internal loop.

energy Threshold

Shows all hits whose minimum free energy is lower than the threshold (the lower the energy, the better). The value has to be lower than or equal to zero.

Note that the output is limited both by the energy threshold and by the maximal hits per target.

Max bulge loop length

The maximal number of unpaired nucleotides in a bulge loop.

No G:U in seed

Selects whether G:U pairs are forbidden in the seed. This parameter can only be chosen if you also use the parameters "Helix constraint from" and "Helix constraint to".

helix constraint to

See "helix constraint from"; this is position b. Both parameters must be set to use helix constraints.

approximate p-value

Used for a quick estimate of the extreme value distribution parameters. You can choose between nothing, 3utr_fly, 3utr_worm and 3utr_human for a better fit to these species. You cannot use Helix constraint and approximate p-values at the same time.
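To make the 1-based "from,to" positions of the helix constraint concrete, the sketch below prints the query nucleotides that a given constraint would cover. It only slices a string and does not call RNAhybrid. helix_region is a hypothetical helper, positions are assumed to be counted from the 5' end of the query, and the hsa-let-7a-5p sequence serves purely as an illustration.

```shell
# helix_region SEQ FROM TO: print nucleotides FROM..TO of SEQ
# (1-based, inclusive). Hypothetical helper, not an RNAhybrid command.
helix_region() {
    printf '%s\n' "$1" | cut -c "$2-$3"
}

# hsa-let-7a-5p, written 5' to 3'; used here only as an example query.
mirna=UGAGGUAGUAGGUUGUAUAGUU

helix_region "$mirna" 2 7     # region covered by -f 2,7  -> prints GAGGUA
helix_region "$mirna" 8 12    # region covered by -f 8,12 -> prints GUAGG
```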

 

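4. Criteria for predicting human miRNA target sites with RNAhybrid

1. Bases 8 to 12 of the miRNA must be perfectly paired with the circRNA. The relevant option is -f helix constraint, set as -f 8,12.

2. An internal loop is a mismatch loop formed by unpaired bases on both strands; neither strand may have more than 1 unpaired base in such a loop. The relevant option is -u <max internal loop size (per side)>, set as -u 1.

3. A bulge loop is formed when one strand carries an extra unpaired base; a bulge may contain at most one base. The relevant option is -v <max bulge loop size>, set as -v 1.

4. G:U pairs are allowed. This is the default; the "No G:U in seed" setting can be used to forbid them.

5. Unpaired overhangs at the ends must not exceed two bases.

6. Three consecutive mismatched bases are not allowed.

7. No more than 4 mismatched bases in total.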

RNAhybrid -g jpg -b 1 -e -20 -f 8,12 -u 1 -v 1 -s 3utr_human -t SFTSV_24vscontrol_DEcircBase.fa -q hsa_miRNA.fa > SFTSV_24bscontrol_circRNA_miRNA_RNAhybrid  # the output is printed directly to the terminal, so redirecting it to a file with ">" is recommended

 

 

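Although -g jpg was set, RNAhybrid produced no jpg files, and I am not sure why. The help output above does end with "PNG and JPG graphical output not supported.", which suggests that this build was compiled without graphics support.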

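The results produced here need to be organized into a two-column data frame with the column names circRNA and miRNA, which is then used in Cytoscape to draw the ceRNA network.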
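One way to build such a table is to rerun the prediction with -c and convert the one-line-per-hit output with awk. The sketch below rests on an assumption that should be checked against your own output: that each compact line is colon-separated as target name, target length, miRNA name, miRNA length, mfe, p-value, position, followed by four alignment strings. compact_to_edges and the file names in the usage comment are hypothetical.

```shell
# compact_to_edges [MAX_MFE]: read RNAhybrid -c lines on stdin and print a
# tab-separated table "circRNA  miRNA  mfe", keeping hits with
# mfe <= MAX_MFE (default 0).
# ASSUMED line layout (verify on your own output):
#   target:target_len:mirna:mirna_len:mfe:pvalue:position:aln1:aln2:aln3:aln4
compact_to_edges() {
    awk -F ':' -v max="${1:-0}" '
        BEGIN { print "circRNA\tmiRNA\tmfe" }
        NF >= 11 {
            # Count fields from the right so that colons inside the target
            # name (for example "chr1:100-200") do not shift the columns.
            mirna  = $(NF - 8)
            mfe    = $(NF - 6)
            target = $1
            for (i = 2; i <= NF - 10; i++) target = target ":" $i
            if (mfe + 0 <= max + 0) print target "\t" mirna "\t" mfe
        }
    '
}

# Usage (hypothetical file names):
#   compact_to_edges -20 < hits_compact.txt > circRNA_miRNA_edges.tsv
```

The first two columns of the resulting file give the circRNA and miRNA columns described above; the mfe column is kept only as an optional edge attribute and can be dropped.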
