This function generates a dotplot or a heatmap to visualize the average expression of features in each identity of a Seurat object. Credits go to Seurat's dev team for the original DotPlot, from which the data processing of this function is derived, and to Ming Tang for the initial idea of using ComplexHeatmap to draw a dotplot and for the layer_fun function that draws the dots. Various new parameters were added to offer more flexibility and customization.

Usage

DotPlot_Heatmap(
  seurat_object,
  assay = "RNA",
  layer = "data",
  data.are.log = TRUE,
  features,
  split.by = NULL,
  idents = NULL,
  split.idents = NULL,
  scale = TRUE,
  rescale = FALSE,
  rescale.range = c(0, 3),
  rotate.axis = FALSE,
  dotplot = TRUE,
  dots.type = "square root",
  dots.size = 4,
  show.noexpr.dots = FALSE,
  col.min = ifelse(isTRUE(scale), -2, 0),
  col.max = ifelse(isTRUE(scale), 2, "q100"),
  data.colors = if (isTRUE(scale)) c("#35A5FF", "white", "red") else "Viridis",
  palette.reverse = FALSE,
  na.color = "grey40",
  background.color = "white",
  idents.colors = NULL,
  show.idents.names.colors = FALSE,
  show.idents.oppo.colors = TRUE,
  split.colors = NULL,
  show.split.names.colors = FALSE,
  show.split.oppo.colors = TRUE,
  order.idents = NULL,
  order.split = NULL,
  order.colors = TRUE,
  kmeans.repeats = 100,
  cluster.idents = TRUE,
  idents.kmeans = 1,
  idents.kmeans.numbers.size = 11,
  cluster.features = TRUE,
  features.kmeans = 1,
  features.kmeans.numbers.size = 11,
  idents.gap = 1,
  features.gap = 1,
  idents.names.size = 9,
  features.names.size = 9,
  features.names.style = "italic",
  row.names.side = "left",
  row.names.width = unit(15, "cm"),
  column.names.angle = 45,
  column.names.side = "bottom",
  column.names.height = unit(15, "cm"),
  inner.border = TRUE,
  outer.border = TRUE,
  data.legend.name = ifelse(isTRUE(scale), "Z-Score", "Average Expression"),
  data.legend.side = "bottom",
  data.legend.direction = "horizontal",
  data.legend.position = "topcenter",
  data.legend.width = 5,
  idents.legend.name = "Clusters",
  show.idents.legend = TRUE,
  split.legend.name = split.by,
  show.split.legend = TRUE,
  legend.title.size = 10,
  legend.text.size = 10,
  legend.gap = 10,
  output.data = FALSE,
  ...
)

Arguments

seurat_object

A Seurat object.

assay

Character. If the Seurat object contains multiple RNA assays, you may specify which one to use (for example, 'RNA2' if you have created a second RNA assay named 'RNA2'; see the Seurat v5 vignettes for more information). You may also use another assay, such as 'SCT', to pull feature expression from.

layer

Character. Formerly known as slot. It is recommended to use 'data'.

data.are.log

Logical. If TRUE, tells the function that data are log transformed. If, and only if, layer = 'data', cell expression values are exponentiated (using expm1) so that averaging is done in non-log space (as per DotPlot or AverageExpression's default behavior). Average expression values are then log transformed back (using log1p). If FALSE, or if layer = 'scale.data' or 'counts', cell expression values are not exponentiated prior to averaging.
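The averaging described above can be illustrated with a minimal base R sketch. The expression values are made up for illustration, and the snippet only mirrors the expm1 / mean / log1p sequence named in the description; it is not the function's internal code.

```r
# Hypothetical log-normalized expression values of one feature
# across the cells of one identity (made-up numbers)
log_expr <- c(0, 0.5, 1.2, 2.3, 0)

# data.are.log = TRUE with layer = "data":
# exponentiate, average in non-log space, then log transform back
avg_nonlog_space <- log1p(mean(expm1(log_expr)))

# data.are.log = FALSE: values are averaged as they are
avg_direct <- mean(log_expr)

print(c(nonlog_space = avg_nonlog_space, direct = avg_direct))
```

Because expm1 is convex, averaging in non-log space gives a value at least as high as averaging the log values directly, so the two settings generally produce different averages.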

features

Character. A vector of features to plot.

split.by

Character. The name of an identity in the meta.data slot to split the active.ident identity by.

idents

Character. A vector with one or several identities names in the active.ident identity to use if you only want those (instead of subsetting your object). If NULL, all identities will be used.

split.idents

Character. A vector with one or several identities names in the split.by identity to use if you only want those. If NULL, all identities will be used.

scale

Logical. If TRUE, average expression values for each feature will be scaled using scale with default parameters. The resulting values are Z-scores (mean-subtracted values divided by the standard deviation) rather than positive average expression values. Positive and negative values are therefore displayed, depending on whether the average expression in a particular identity is above or below the mean average expression across all identities (which is calculated independently for each feature). Caution should be exercised when interpreting results with a low number of identities (typically below 5), as small differences in average expression might lead to exaggerated differences when scaled.
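As a rough illustration of the caution above, the following base R sketch applies scale with default parameters to a small, made-up matrix of average expression values. The orientation (identities as rows, features as columns) and the feature names are assumptions chosen so that scale works feature by feature.

```r
# Hypothetical average expression: 3 identities (rows) x 2 features (columns)
avg <- matrix(c(0.1, 0.5, 2.0,    # GENE1: large differences between identities
                1.0, 1.1, 1.2),   # GENE2: very small differences
              nrow = 3,
              dimnames = list(c("A", "B", "C"), c("GENE1", "GENE2")))

# scale() centers and scales each column, giving one Z-score per identity
z <- scale(avg)

print(round(z, 2))
```

GENE2 barely changes between identities (1.0 to 1.2), yet its Z-scores span -1 to 1, the same order of magnitude as GENE1. With few identities, such small differences can look large once scaled.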

rescale

Logical. If TRUE, average expression values will be adjusted using rescale between the first numerical value of the rescale.range parameter (lowest expression) and the second numerical value (highest expression) for each feature. This differs from scale in that values are not compared to any mean or standard deviation, so the result is not a Z-score. It only refits each average expression value (independently for each feature) so that all features can be visualized on the same scale regardless of differences in their expression levels. Caution should be exercised when interpreting results with a low number of identities (typically below 5), as small differences in average expression might lead to exaggerated differences when rescaled. Ignored if scale = TRUE.

rescale.range

Numeric. A vector specifying the minimum and maximum values to which the average expression values are rescaled, internally passed to rescale. These values are arbitrary and will not change the visualization, only the values in the legend. To influence the color scale, adjust the col.min and col.max parameters instead. Ignored if rescale = FALSE or scale = TRUE.
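The idea behind rescale and rescale.range can be sketched in base R with a simple min-max transformation. The helper rescale_to_range below is hypothetical and only stands in for the rescale call mentioned above; the feature values are made up.

```r
# Hypothetical base R stand-in for a min-max rescaling to a target range
rescale_to_range <- function(x, to = c(0, 3)) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1]) * (to[2] - to[1]) + to[1]
}

# Two made-up features with very different expression levels
low_feature  <- c(0.1, 0.2, 0.3)
high_feature <- c(10, 20, 30)

# Each feature is refit independently to the same 0-3 range
r_low  <- rescale_to_range(low_feature)
r_high <- rescale_to_range(high_feature)
print(rbind(r_low, r_high))

# Changing the range only changes the numbers, not their relative positions
r_unit <- rescale_to_range(high_feature, to = c(0, 1))
print(r_unit * 3)
```

Both features end up on the same scale, and a different range only multiplies and shifts the values, which is why the range affects the legend values rather than the appearance of the plot.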

rotate.axis

Logical. If TRUE, flips the axes, so that features are displayed as rows and identities as columns.

dotplot

Logical. If TRUE, the function will display a dotplot, with dots size proportional to the percentage of cells expressing a feature. If FALSE, the function will instead display a heatmap.

dots.type

Character. Determines the dots size difference between 0 and 100% expression. Either 'square root' (lower difference) or 'radius' (higher difference). Ignored if dotplot = FALSE.
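One plausible reading of the two options, shown here as a base R sketch, is that 'square root' makes the dot radius proportional to the square root of the percentage (so the dot area tracks the percentage), while 'radius' makes the radius itself proportional to the percentage. This interpretation, the variable names and the percentages are assumptions for illustration, not the function's confirmed internals.

```r
# Hypothetical percentages of cells expressing a feature
pct <- c(5, 25, 50, 100)
dots_size <- 4  # same value as the dots.size default

# Assumed 'square root' behavior: radius grows with sqrt(percentage)
r_sqrt <- sqrt(pct / 100) * dots_size

# Assumed 'radius' behavior: radius grows linearly with percentage
r_radius <- (pct / 100) * dots_size

print(data.frame(pct, r_sqrt = round(r_sqrt, 2), r_radius))
```

Under these assumptions, the smallest and largest dots are closer in size with the square root option than with the radius option, which matches the "lower difference" and "higher difference" wording above.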

dots.size

Numeric. The size of the dots in the dotplot. Decreasing this parameter helps when displaying a large number of features. Ignored if dotplot = FALSE.

show.noexpr.dots

Logical. If TRUE, the function will display a small dot for features with 0% expression instead of nothing. Ignored if dotplot = FALSE.

col.min

Character or Numeric. The minimum value for the breaks parameter internally passed to colorRamp2. If character, must be a quantile in the form 'qX' where X is a number between 0 and 100. A value of 'q5' or 'q10' is useful to reduce the effect of outlier values (e.g. a very low value that significantly alters the color scale range of all other values).

col.max

Character or Numeric. The maximum value for the breaks parameter internally passed to colorRamp2. If character, must be a quantile in the form 'qX' where X is a number between 0 and 100. A value of 'q95' or 'q90' is useful to reduce the effect of outlier values (e.g. a very high value that significantly alters the color scale range of all other values).
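To show how a 'qX' value can tame an outlier, here is a minimal base R sketch. The helper resolve_limit is hypothetical (it is not part of the package) and the values are made up; it only illustrates turning a numeric value or a quantile string into a numeric limit.

```r
# Hypothetical helper: resolve a numeric value or a 'qX' string to a number
resolve_limit <- function(limit, values) {
  if (is.numeric(limit)) return(limit)
  stopifnot(is.character(limit), grepl("^q[0-9.]+$", limit))
  prob <- as.numeric(sub("^q", "", limit)) / 100
  stopifnot(prob >= 0, prob <= 1)
  unname(quantile(values, probs = prob, na.rm = TRUE))
}

# Made-up average expression values with one very high outlier
values <- c(seq(0, 2, length.out = 20), 15)

max_q100 <- resolve_limit("q100", values)  # the outlier sets the upper limit
max_q95  <- resolve_limit("q95", values)   # the outlier is ignored
max_num  <- resolve_limit(2, values)       # numeric values are used as is

print(c(q100 = max_q100, q95 = max_q95, numeric = max_num))
```

With 'q100' the single outlier stretches the color scale over a range that the other values barely use, whereas 'q95' keeps the scale focused on the bulk of the values.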

data.colors

Character. Either a character vector of exactly 3 colors, corresponding to the lowest, zero (or middle if scale = FALSE), and highest values in the expression matrix and internally passed to colorRamp2, or a single character value corresponding to the name of a palette and internally passed to the hcl_palette parameter of colorRamp2 (such as 'Inferno', 'Berlin', 'Viridis' etc, check hcl.pals for all palettes available).

palette.reverse

Logical. If TRUE and if data.colors is a palette (such as 'Viridis'), the function will reverse its colors.
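The palette names mentioned for data.colors can be checked with base R's grDevices functions. The sketch below lists the available HCL palettes and shows what reversing one looks like; it uses hcl.colors purely for illustration and does not claim to reproduce how the function builds its color scale.

```r
# Names of all HCL palettes known to this R installation
available <- hcl.pals()
print(c("Viridis", "Inferno", "Berlin") %in% available)

# Five colors from the 'Viridis' palette, in default and reversed order
forward  <- hcl.colors(5, palette = "Viridis")
reversed <- hcl.colors(5, palette = "Viridis", rev = TRUE)

print(rbind(forward, reversed))
```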

na.color

Character. The color to use for missing values (NA).

background.color

Character. The color to use for the background behind the dots. Ignored if dotplot = FALSE.

idents.colors

Character. A vector of colors to use for the active.ident identity, of same length as the number of identities in the active.ident identity or supplied to the idents parameter. If NULL, uses Seurat's default colors.

show.idents.names.colors

Logical. If TRUE, the function will display the colors specified by the idents.colors parameter next to identities names.

show.idents.oppo.colors

Logical. If TRUE, the function will display the colors specified by the idents.colors parameter on the opposite side of identities names.

split.colors

Character. A vector of colors to use for the split.by identity, of same length as the number of identities in the split.by identity or supplied to the split.idents parameter. If NULL, uses a custom set of colors from colors. Ignored if split.by = NULL.

show.split.names.colors

Logical. If TRUE, the function will display the colors specified by the split.colors parameter next to identities names. Ignored if split.by = NULL.

show.split.oppo.colors

Logical. If TRUE, the function will display the colors specified by the split.colors parameter on the opposite side of identities names. Ignored if split.by = NULL.

order.idents

Character or Numeric. A vector specifying either 'reverse' or the levels (as character or as numeric values corresponding to the indexes) of the active.ident identity to order the cells. If cluster.idents = TRUE or Function, only the legend names will be ordered.

order.split

Character or Numeric. A vector specifying either 'reverse' or the levels (as character or as numeric values corresponding to the indexes) of the split.by identity to order the cells. If cluster.idents = TRUE or Function, only the legend names will be ordered. Ignored if split.by = NULL.

order.colors

Logical. If TRUE, the colors for the active.ident identity and the split.by identity will automatically be ordered according to order.idents and order.split. Ignored if order.idents and order.split are NULL.

kmeans.repeats

Numeric. The number of k-means runs to get a consensus k-means clustering. Ignored if idents.kmeans and features.kmeans are equal to 1.

cluster.idents

Logical or Function. If TRUE, the function will cluster the identities. You may also pass an hclust or dendrogram object which contains clustering.

idents.kmeans

Numeric. The number of k-means slices to use for identities clustering.

idents.kmeans.numbers.size

Numeric. The font size of the identities k-means slices numbers. Set to 0 to remove them.

cluster.features

Logical or Function. If TRUE, the function will cluster the features. You may also pass an hclust or dendrogram object which contains clustering.

features.kmeans

Numeric. The number of k-means slices to use for features clustering.

features.kmeans.numbers.size

Numeric. The font size of the features k-means slices numbers. Set to 0 to remove them.

idents.gap

Numeric. The gap between the identities slices. Ignored if idents.kmeans = 1.

features.gap

Numeric. The gap between the features slices. Ignored if features.kmeans = 1.

idents.names.size

Numeric. The font size of the identities names. Set to 0 to remove them.

features.names.size

Numeric. The font size of the features names. Set to 0 to remove them.

features.names.style

Character. The font face of the features names. The gene nomenclature used by almost all scientific journals requires that feature names be italicized, so the parameter is set to 'italic' by default. Use 'plain' to revert to the regular font face.

row.names.side

Character. The side where the row names will be displayed, either 'left' or 'right'. The dendrogram will be displayed on the opposite side.

row.names.width

Numeric. The width of the row names. Increase this parameter if your row names are truncated.

column.names.angle

Numeric. The angle of rotation of the column names.

column.names.side

Character. The side where the column names will be displayed, either 'top' or 'bottom'. The dendrogram will be displayed on the opposite side.

column.names.height

Numeric. The height of the column names. Increase this parameter if your column names are truncated.

inner.border

Logical. If TRUE, the function will display a black outline around each dot if dotplot = TRUE, or a black border around each cell of the heatmap if dotplot = FALSE.

outer.border

Logical. If TRUE, the function will display an outer border around the plot or around each slice if idents.kmeans and/or features.kmeans are higher than 1.

data.legend.name

Character. The name of the data legend.

data.legend.side

Character. The side where the data legend will be displayed, either 'left', 'right', 'top' or 'bottom'.

data.legend.direction

Character. The direction of the data legend, either 'horizontal' or 'vertical'.

data.legend.position

Character. The centering of the data legend name. Many options are available; the default option from Heatmap is 'topleft'.

data.legend.width

Numeric. How long the data legend will be, only affects the data legend if data.legend.direction = 'horizontal'.

idents.legend.name

Character. The name of the active.ident identity legend. Ignored if show.idents.names.colors and show.idents.oppo.colors are FALSE.

show.idents.legend

Logical. If TRUE, the function will display a legend for the active.ident identity. Ignored if show.idents.names.colors and show.idents.oppo.colors are FALSE.

split.legend.name

Character. The name of the split.by identity legend. Ignored if split.by = NULL, or if show.split.names.colors and show.split.oppo.colors are FALSE.

show.split.legend

Logical. If TRUE, the function will display a legend for the split.by identity. Ignored if show.split.names.colors and show.split.oppo.colors are FALSE.

legend.title.size

Numeric. The font size of all legend titles.

legend.text.size

Numeric. The font size of all legend texts.

legend.gap

Numeric. The gap between the legends and the plot. This parameter sets the value in the global options of ht_opt, so it will affect all Heatmap objects in the same R session. Use ComplexHeatmap::ht_opt(RESET = TRUE) to restore default parameters.

output.data

Logical. If TRUE, the function will return a list containing a matrix of the average expression data, scaled or not, and another matrix containing the percentage of cells expressing each feature, instead of displaying anything.

...

Additional arguments to be passed to Heatmap, such as show_parent_dend_line, clustering_method_rows, etc. Any parameter that isn't already internally passed to Heatmap is accepted. For example, outer.border sets the border parameter of Heatmap, so you will get an error if you try to pass the border parameter to DotPlot_Heatmap.

Value

A Heatmap object, drawn either as a dotplot or as a heatmap, or, if output.data = TRUE, a list containing a matrix of the average expression data, scaled or not, and another matrix containing the percentage of cells expressing each feature.

Examples

# Prepare data
pbmc3k <- Right_Data("pbmc3k")
pbmc3k.markers <- c("CCR7", "CD14", "CD40LG",
                    "CD79A", "CD8A", "CDKN1C",
                    "GNLY", "CLEC10A", "PPBP")

# Example 1: default parameters
DotPlot_Heatmap(pbmc3k,
                features = pbmc3k.markers)