RNA-seq Visualization [2]: How to Make Your Heatmap "Hot"
Background
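In RNA-seq data, a heatmap shows differences in the expression levels of genes, repeat sequences, and similar features across samples such as tissues, cells, or replicates. Clustering can also be used to show how the expression of different genes changes across samples, which brings out the differential results. A heatmap makes these differences much easier to see.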
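To make the clustering idea concrete, here is a minimal base-R sketch on a made-up matrix (the gene and sample names and all values are invented for illustration). It uses Euclidean distance with complete-linkage hierarchical clustering, which, as far as I know, matches pheatmap's default settings, though this is not pheatmap's own code.

```r
set.seed(1)

# Toy matrix: 6 genes x 4 samples (made-up values).
# gene1-3 are high in sample1-2; gene4-6 are high in sample3-4.
expr <- rbind(
  matrix(c(8, 8, 1, 1), nrow = 3, ncol = 4, byrow = TRUE),
  matrix(c(1, 1, 8, 8), nrow = 3, ncol = 4, byrow = TRUE)
) + matrix(rnorm(24, sd = 0.2), nrow = 6)
rownames(expr) <- paste0("gene", 1:6)
colnames(expr) <- paste0("sample", 1:4)

# Pairwise Euclidean distance between gene expression profiles
d <- dist(expr)

# Hierarchical clustering (complete linkage is the hclust default)
hc <- hclust(d)

# Cut the tree into two groups of genes
groups <- cutree(hc, k = 2)
print(groups)
```

Genes with similar profiles end up in the same branch of the tree, and a clustered heatmap places such genes next to each other.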
Data preparation
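Before drawing a heatmap, we need normalized, relative quantification results from the RNA-seq data. Normalization can be computed in several ways; the main ones in use today are: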
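1) RPM (CPM) = Total exon reads / Mapped reads (millions);
2) RPKM = Total exon reads / [Mapped reads (millions) × Exon length (kb)];
3) FPKM = Total exon fragments / [Mapped fragments (millions) × Exon length (kb)]. RPKM stands for Reads Per Kilobase Million and FPKM for Fragments Per Kilobase Million. RPKM is used for single-end RNA-seq and FPKM for paired-end RNA-seq, where the two reads of a pair count as one fragment.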
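The RPM (CPM) and RPKM formulas above can be sketched in base R as follows. The count matrix and the exon lengths are invented for illustration, and taking the column sums as the number of mapped reads is an assumption of this sketch; in a real analysis that number would come from your alignment or counting step.

```r
# Toy raw counts: 3 genes x 2 samples (made-up numbers)
counts <- matrix(c(100, 200, 700,
                   300, 300, 400),
                 nrow = 3,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 c("sample1", "sample2")))

# Hypothetical total exon length of each gene, in base pairs
exon_length_bp <- c(geneA = 1000, geneB = 2000, geneC = 500)

# Mapped reads per sample, in millions
# (assumption: approximated here by the column sums of the toy matrix)
mapped_millions <- colSums(counts) / 1e6

# RPM (CPM): divide each column by that sample's mapped reads in millions
cpm <- sweep(counts, 2, mapped_millions, "/")

# RPKM: additionally divide each row by the exon length in kilobases
rpkm <- sweep(cpm, 1, exon_length_bp / 1000, "/")

print(cpm)
print(rpkm)
```

For paired-end data the same arithmetic gives FPKM when the matrix holds fragment counts instead of read counts.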
Gene/Repeat name    sample1, sample2, sample3, ...
A, B, C, ...        RPM (CPM), RPKM, or FPKM values
Drawing the heatmap
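A typical expression matrix is laid out as follows: the leftmost column holds the gene names, each remaining column is one sample, and the numbers are the expression values. We can save it in .csv format so that R can read it easily (.xlsx or .txt also work), and then draw the heatmap we need from these expression values.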
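# Read the expression matrix from the project folder
forheatmap <- read.csv(file = "for_heatmap.csv", header = TRUE)
# Use the first column (gene names) as row names, then drop it
rownames(forheatmap) <- forheatmap[, 1]
forheatmap <- forheatmap[, -1]
# Scale each gene (row), keep the sample order, and show gene names
pheatmap::pheatmap(forheatmap, scale = "row", cluster_cols = FALSE, show_rownames = TRUE, angle_col = "0")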
Save the expression matrix in the R project folder, run the R script above, and we get our heatmap! (Simple, isn't it?)
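Before plotting real data, a little clean-up often helps. The sketch below is one possible way to do it, not the only one: it writes a tiny made-up table to a temporary CSV (the file name and values are invented), reads it back with row.names = 1, drops genes whose expression is constant, and applies a log2(x + 1) transform. Whether a log transform suits your data is a judgment call. The drawing step runs only if pheatmap is installed.

```r
# Write a tiny made-up expression table to a temporary CSV (illustration only)
csv_path <- file.path(tempdir(), "for_heatmap_demo.csv")
demo <- data.frame(
  gene    = c("geneA", "geneB", "geneC", "geneD"),
  sample1 = c(10, 0, 5, 100),
  sample2 = c(12, 0, 50, 90),
  sample3 = c(11, 0, 500, 110)
)
write.csv(demo, csv_path, row.names = FALSE)

# row.names = 1 uses the first column as row names in a single step
expr <- as.matrix(read.csv(csv_path, header = TRUE, row.names = 1))

# Drop genes with zero variance: row scaling divides by the row SD,
# so a constant row (here geneB) would turn into NaN
expr <- expr[apply(expr, 1, sd) > 0, , drop = FALSE]

# log2(x + 1) compresses the dynamic range before scaling
log_expr <- log2(expr + 1)

# Row z-scores: subtract each row's mean and divide by its SD,
# which is the transformation requested by scale = "row"
z <- t(scale(t(log_expr)))

# Draw the heatmap only if the pheatmap package is available
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(log_expr, scale = "row", cluster_cols = FALSE)
}
```

Filtering constant rows first avoids the missing-value error that row scaling followed by clustering would otherwise raise.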
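To change the heatmap's colors, show or hide row and column names, or adjust other arguments, type ?pheatmap in the Console to see every argument of pheatmap (load the package first with library(pheatmap), or use ?pheatmap::pheatmap). The help page also includes example scripts that are worth trying out.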
Example
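# Load the package (the calls below use pheatmap() without the pheatmap:: prefix)
library(pheatmap)
# Create test matrix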
test = matrix(rnorm(200), 20, 10)
test[1:10, seq(1, 10, 2)] = test[1:10, seq(1, 10, 2)] + 3
test[11:20, seq(2, 10, 2)] = test[11:20, seq(2, 10, 2)] + 2
test[15:20, seq(2, 10, 2)] = test[15:20, seq(2, 10, 2)] + 4
colnames(test) = paste("Test", 1:10, sep = "")
rownames(test) = paste("Gene", 1:20, sep = "")
# Draw heatmaps
pheatmap(test)
pheatmap(test, kmeans_k = 2)
pheatmap(test, scale = "row", clustering_distance_rows = "correlation")
pheatmap(test, color = colorRampPalette(c("navy", "white", "firebrick3"))(50))
pheatmap(test, cluster_row = FALSE)
pheatmap(test, legend = FALSE)
# Show text within cells
pheatmap(test, display_numbers = TRUE)
pheatmap(test, display_numbers = TRUE, number_format = "%.1e")
pheatmap(test, display_numbers = matrix(ifelse(test > 5, "*", ""), nrow(test)))
pheatmap(test, cluster_row = FALSE, legend_breaks = -1:4, legend_labels = c("0",
         "1e-4", "1e-3", "1e-2", "1e-1", "1"))
# Fix cell sizes and save to file with correct size
pheatmap(test, cellwidth = 15, cellheight = 12, main = "Example heatmap")
pheatmap(test, cellwidth = 15, cellheight = 12, fontsize = 8, filename = "test.pdf")
# Generate annotations for rows and columns
annotation_col = data.frame(
    CellType = factor(rep(c("CT1", "CT2"), 5)),
    Time = 1:5
)
rownames(annotation_col) = paste("Test", 1:10, sep = "")
annotation_row = data.frame(
    GeneClass = factor(rep(c("Path1", "Path2", "Path3"), c(10, 4, 6)))
)
rownames(annotation_row) = paste("Gene", 1:20, sep = "")
# Display row and color annotations
pheatmap(test, annotation_col = annotation_col)
pheatmap(test, annotation_col = annotation_col, annotation_legend = FALSE)
pheatmap(test, annotation_col = annotation_col, annotation_row = annotation_row)
# Change angle of text in the columns
pheatmap(test, annotation_col = annotation_col, annotation_row = annotation_row, angle_col = "45")
pheatmap(test, annotation_col = annotation_col, angle_col = "0")
# Specify colors
ann_colors = list(
    Time = c("white", "firebrick"),
    CellType = c(CT1 = "#1B9E77", CT2 = "#D95F02"),
    GeneClass = c(Path1 = "#7570B3", Path2 = "#E7298A", Path3 = "#66A61E")
)
pheatmap(test, annotation_col = annotation_col, annotation_colors = ann_colors, main = "Title")
pheatmap(test, annotation_col = annotation_col, annotation_row = annotation_row,
         annotation_colors = ann_colors)
pheatmap(test, annotation_col = annotation_col, annotation_colors = ann_colors[2])
# Gaps in heatmaps
pheatmap(test, annotation_col = annotation_col, cluster_rows = FALSE, gaps_row = c(10, 14))
pheatmap(test, annotation_col = annotation_col, cluster_rows = FALSE, gaps_row = c(10, 14),
         cutree_col = 2)
# Show custom strings as row/col names
labels_row = c("", "", "", "", "", "", "", "", "", "", "", "", "", "", "",
               "", "", "Il10", "Il15", "Il1b")
pheatmap(test, annotation_col = annotation_col, labels_row = labels_row)
# Specifying clustering from distance matrix
drows = dist(test, method = "minkowski")
dcols = dist(t(test), method = "minkowski")
pheatmap(test, clustering_distance_rows = drows, clustering_distance_cols = dcols)
# Modify ordering of the clusters using clustering callback option
callback = function(hc, mat){
    sv = svd(t(mat))$v[,1]
    dend = reorder(as.dendrogram(hc), wts = sv)
    as.hclust(dend)
}
pheatmap(test, clustering_callback = callback)
## Not run:
# Same using dendsort package
library(dendsort)
callback = function(hc, ...){dendsort(hc)}
pheatmap(test, clustering_callback = callback)
## End(Not run)
For more scripts, see my GitHub page: