In data analysis, you often need to sort a dataset by one or more of its variables. R offers two base functions for this. sort() returns the sorted values of a vector, while order() returns the positions that would sort it, which makes order() the function to use when reordering the rows of a data frame. Both work on continuous and factor variables, and both can arrange data in ascending or descending order.

Syntax:

sort(x, decreasing = FALSE, na.last = TRUE)

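Arguments:

  • x: A vector of continuous or factor values
  • decreasing: Controls the direction of the sort. The default, FALSE, sorts in ascending order; TRUE sorts in descending order.
  • na.last: Controls where NA values are placed. TRUE puts them last, FALSE puts them first, and NA removes them. Note that sort() itself defaults to na.last = NA, whereas order() defaults to na.last = TRUE.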
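The arguments above can be tried on a small vector. This is a minimal sketch; the vector x and the factor f are made-up illustration data, not part of the original examples.

```r
# Illustration data (hypothetical): a numeric vector with one missing value
x <- c(3.2, NA, 1.5, 4.8, 2.1)

asc     <- sort(x)                     # ascending; NA is dropped by default
desc    <- sort(x, decreasing = TRUE)  # descending; NA is still dropped
with_na <- sort(x, na.last = TRUE)     # ascending; NA is kept and placed last

# A factor is sorted by the order of its levels, not alphabetically
f <- factor(c("low", "high", "medium", "low"),
            levels = c("low", "medium", "high"))
sorted_f <- sort(f)

print(asc)
print(desc)
print(with_na)
print(sorted_f)
```

Because the levels of f are declared as low, medium, high, the sorted factor follows that order rather than the alphabetical one.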

Example 1

For instance, we can create a tibble and sort it by one or several variables. A tibble is a modern take on the data frame. It improves on the data frame syntax and avoids frustrating data type conversions, especially the conversion of character columns to factors. It is also a convenient way to create a data frame by hand, which is our purpose here. To learn more about tibbles, please refer to the vignette: https://cran.r-project.org/web/packages/tibble/vignettes/tibble.html

library(dplyr)  # dplyr re-exports tibble()
set.seed(1234)
data_frame <- tibble(  
    c1 = rnorm(50, 5, 1.5),   
    c2 = rnorm(50, 5, 1.5),  
    c3 = rnorm(50, 5, 1.5),
    c4 = rnorm(50, 5, 1.5),     
    c5 = rnorm(50, 5, 1.5)
)
# Sort by c1
df <- data_frame[order(data_frame$c1), ]
head(df)
c1 c2 c3 c4 c5
1.481453 3.477557 4.246283 3.686611 6.0511003
1.729941 5.824996 4.525823 6.753663 0.1502718
2.556360 6.275348 2.524849 6.368483 5.4787404
2.827693 4.769902 5.120089 3.743626 4.0103449
2.988510 4.395902 2.077631 4.236895 4.6176880
3.122021 6.317305 5.413840 3.551145 5.6067027
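The row-indexing idiom used in Example 1 relies on order() returning positions rather than values. A minimal sketch on made-up data (scores and small are hypothetical names used only for illustration) shows what happens step by step:

```r
# Illustration data (hypothetical)
scores <- c(72, 95, 60)

# order() returns the positions that would sort the vector
idx <- order(scores)   # 3 1 2: the smallest value sits at position 3
scores[idx]            # same values as sort(scores)

# The same positions can be used to reorder the rows of a data frame
small <- data.frame(id = c("b", "a", "c"), score = c(72, 95, 60))
sorted_small <- small[order(small$score), ]
print(sorted_small)
```

The comma inside the square brackets matters: the positions go in the row slot, and the empty column slot keeps every column.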

Example 2

# Sort by c3 and c4
df <- data_frame[order(data_frame$c3, data_frame$c4), ]
head(df)
c1 c2 c3 c4 c5
2.988510 4.395902 2.077631 4.236895 4.617688
2.556360 6.275348 2.524849 6.368483 5.478740
3.464516 3.914627 2.730068 9.565649 6.016123
4.233486 3.292088 3.133568 7.517309 4.772395
3.935840 2.941547 3.242078 6.464048 3.599745
3.835619 4.947859 3.335349 4.378370 7.240240

Example 3

# Sort by c3 (descending), then by c4 (ascending)
df <- data_frame[order(-data_frame$c3, data_frame$c4), ]
head(df)
c1 c2 c3 c4 c5
4.339178 4.450214 8.087243 4.5010140 8.410225
3.959420 8.105406 7.736313 7.1168936 5.431565
3.339023 3.298088 7.494285 5.9303153 7.035912
3.397036 5.382794 7.092722 0.7163620 5.620098
6.653446 4.733315 6.520536 0.9016707 4.513410
4.558559 4.712609 6.380086 6.0562703 5.044277
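The minus sign in Example 3 only works for numeric columns. For a character column, one option in base R is to pass a vector to the decreasing argument of order(), which is supported when method = "radix". The sketch below uses a made-up data frame (people, with hypothetical columns name, team and age) to sort one column descending and another ascending:

```r
# Illustration data (hypothetical)
people <- data.frame(
  name = c("Ana", "Ben", "Cid", "Dee"),
  team = c("red", "blue", "red", "blue"),
  age  = c(31, 25, 28, 27),
  stringsAsFactors = FALSE
)

# team descending, then age ascending; a per-column `decreasing`
# vector requires method = "radix"
idx <- order(people$team, people$age,
             decreasing = c(TRUE, FALSE), method = "radix")
sorted_people <- people[idx, ]

# Alternative: xtfrm() maps the column to numbers that sort the same way,
# so the minus-sign trick becomes available again
idx_alt <- order(-xtfrm(people$team), people$age)

print(sorted_people)
```

Both approaches give the same row order here. The radix method sorts character data in the C locale, so results on accented or mixed-case text may differ from a locale-aware sort.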
 

A work by Gianluca Sottile

gianluca.sottile@unipa.it