Questions tagged [sequence-analysis]

Analysis of DNA, RNA or protein (amino acid) sequences. Here sequences are compared to analyze similarities and conserved regions. Sequence analysis is usually done by specialized software.

132 questions
38
votes
3 answers

What is the difference between local and global sequence alignments?

There are a bunch of different alignment tools out there, and I don't want to get bogged down in the maths behind them, as this varies not only between software but also from software version to version. There are two main divides in the programs; some…
James
12
votes
1 answer

Which is the reference 16S rRNA?

Recently, I've stumbled upon a fact, which hasn't bothered me for many years. The fact is that all universal 16S primers are written as "[FR][0-9]+" (in regex notation), that is they have a position with respect to a reference. I've read through…
11
votes
1 answer

Standard practice for generating rarefaction curves from Next Generation Sequencing data

We have a few million 18S reads from a particular environment. The reads have been clustered into Operational Taxonomic Units (OTUs), and the OTUs annotated against a reference database. To generate a rarefaction curve, my understanding is that one…
11
votes
1 answer

Sequence evolution simulation tool

I'm looking for a tool to simulate sequence evolution given a specific mutation model and birth-death model. I'm aware of tools and packages like INDELible, Seq-Gen and PhyloSim, but they simulate evolution along phylogenetic trees. What I want is…
9
votes
4 answers

If we sequenced the genome of every species, would all phylogenies agree?

The Tree of Life is still up for debate. Most of this debate seems to be due to a lack of genomic information, but that deficiency is decreasing rapidly with advances in technology and sequencing power. Hypothetically, if we knew the genome of every…
9
votes
1 answer

Comment on the introduction to a bioinformatics paper

I've written a paper about DNA sequence analysis. This paper attempts to use Bayesian modelling for a set of DNA sequences. It will probably end up either in a statistics journal, or, more likely, in a bioinformatics journal. My concern is that…
Faheem Mitha
8
votes
4 answers

Significance of upper-case, lower-case and Ns in UCSC DNA files

I have downloaded human chromosome data from the UCSC FTP site. Some parts are in lowercase letters and some are in uppercase letters. Does this indicate the coding and non-coding regions? Here is an example from the file I just…
Failed Scientist
8
votes
3 answers

What is the difference between sequence alignment and sequence assembly?

I read the Wikipedia pages about sequence alignment and sequence assembly, but I have not been able to find any difference between the two. What is the difference between sequence alignment and sequence assembly? If there is no difference, why are the…
8
votes
2 answers

RNA-Seq library construction challenges: the biases of RNA fragmentation vs cDNA fragmentation

I recently watched a presentation on RNA-seq that covered some of the choices one can make along the way, and I didn't fully understand one of the choices in particular. Near the beginning of the process, you can choose a fragmentation method (e.g.…
Jota
8
votes
2 answers

How to check if a fastq file has single or paired end reads

I am trying to check if a fastq file has single or paired end reads. How can I achieve this with an error-proof method? I checked Wikipedia and MAQ, but I want to know if there is a reliable document that describes all possible variants in sequence…
gc5
7
votes
2 answers

Comparative evolutionary study: is amino acid or nucleotide comparison more useful?

I am a high school student and am currently learning about evolutionary relationship study in biology. My teacher said that a comparative study of amino acid sequences is more useful than a comparative study of nucleotide sequences, because the…
Szeto
7
votes
1 answer

How can I generate a random DNA sequence?

I've found this paper which involves the construction of 19-bp random DNA sequences, but I don't know enough biology to understand how this method works. Could someone explain it to someone who is highly technical, but has knowledge of only…
vrume21
7
votes
2 answers

How are DNA virus cladograms actually calculated in practice? Is the procedure different for RNA viruses? Are these processes somewhat subjective?

The May 24, 2022 Bloomberg opinion piece Monkeypox Isn’t Looking Like a Covid-Sized Threat; It’s still early, but contact-tracing efforts and analysis of the virus’s genome offer hope that this outbreak can be contained. includes the following: Why…
6
votes
1 answer

What is the biological significance of k-mer counting?

There are many tools developed to compute the counts of k-mers present in a gene sequence. Jellyfish, Bloom Filter Counter, DSK Kmer Counter, KAnalyze, KMC 2, etc. are some of the efficient software tools developed in the last decade to count k-mers. But in what…
Enamul Hassan
6
votes
2 answers

Why is the quality range of fastq format so broad?

Referring to the fastq format, it is clear that there are 94 quality values for a sequenced nucleic acid of a DNA sequence read, and they…
Enamul Hassan