Other answers here have focused on the overall occurrence of different letters in the sequence, which may be one aspect of the "randomness" expected. However, another aspect of interest is the apparent randomness in the order of the letters in the sequence. At minimum, I would think that "randomness" entails the exchangeability of the vector of letters, which can be tested using a "runs test". The runs test counts the number of "runs" in the sequence and compares the total number of runs to its null distribution under the null hypothesis of exchangeability, for a vector with the same letters. The exact definition of what constitutes a "run" depends on the particular test (see e.g., a similar answer here), but in this case, with nominal categories, the natural definition is to count any consecutive sequence consisting of only one letter as a single "run".
For example, your sequence BABD-CABC-DACD-BACD
looks prima facie non-random to me (no letter appears with itself, which is probably unusual for a sequence this long).$^\dagger$ To test this formally, we can perform a runs test for exchangeability. In this sequence we have $n = 16$ letters (four of each letter) and there are $r = 16$ runs, each consisting of one single instance of a letter. The observed number of runs can be compared to its null distribution under the hypothesis of exchangeability. We can do this via simulation, which yields a simulated null distribution and a p-value for the test. The result for this sequence of characters is shown in the graph below.
For this sequence, the p-value for the runs test (under the null hypothesis of exchangeability) is $p=0.0537$. This is significant at the 10% significance level, but not at the 5% significance level. There is some evidence to suggest a non-exchangeable series (i.e., non-random order), but the evidence is not particularly strong. With a longer observed string, the runs test would have greater power to distinguish an exchangeable string from a non-exchangeable string. (As you can see, my initial prima facie judgment that this string is non-random may be wrong - the p-value is not actually as low as I expected it to be.)
Finally, it is important to note that this test only looks at the randomness of the order of the letters in the string - it takes the number of letters of each type as a fixed input. This test will detect non-randomness in the sense of non-exchangeability of the letters in the string, but it will not test "randomness" in the sense of overall probabilities of different letters. If the latter is also part of the specified meaning of "randomness" then this runs test could be augmented with another test that looks at the overall counts of the letters, and compares this to a hypothesised null distribution.
R code: The above plot and p-value was generated using the following R
code:
#Define the character string vector (as factors)
x <- factor(c(2,1,2,4, 3,1,2,3, 4,1,3,4, 2,1,3,4),
labels = c('A', 'B', 'C', 'D'))
#Define a function to calculate the runs for an input vector
RUNS <- function(x) { n <- length(x);
R <- 1;
for (i in 2:n) { R <- R + (x[i] != x[i-1]) }
R; }
#Simulate the runs statistic for k permutations
k <- 10^5;
set.seed(12345);
RR <- rep(0, k);
for (i in 1:k) { x_perm <- sample(x, length(x), replace = FALSE);
RR[i] <- RUNS(x_perm); }
#Generate the frequency table for the simulated runs
FREQS <- as.data.frame(table(RR));
#Calculate the p-value of the runs test
R <- RUNS(x);
R_FREQ <- FREQS$Freq[match(R, FREQS$RR)];
p <- sum(FREQS$Freq*(FREQS$Freq <= R_FREQ))/k;
#Plot estimated distribution of runs with test
library(ggplot2);
ggplot(data = FREQS, aes(x = RR, y = Freq/k, fill = (Freq <= R_FREQ))) +
geom_bar(stat = 'identity') +
geom_vline(xintercept = match(R, FREQS$RR)) +
scale_fill_manual(values = c('Grey', 'Red')) +
theme(legend.position = 'none',
plot.title = element_text(hjust = 0.5, face = 'bold'),
plot.subtitle = element_text(hjust = 0.5),
axis.title.y = element_text(margin = margin(t = 0, r = 10, b = 0, l = 0))) +
labs(title = 'Runs Test - Plot of Distribution of Runs under Exchangeability',
subtitle = paste0('(Observed runs is black line, p-value = ', p, ')'),
x = 'Runs', y = 'Estimated Probability');
$^\dagger$ I have broken the sequence up with dashes solely to make it easier to read; the dashes have no significance to the analysis.