MaxEnt Bootstrapping for Biological Sequence Motifs

Much work in bioinformatics comes down to determining whether a given property of a sequence motif is “interesting” or “unusual”. To help answer such questions in a principled manner, we can define a maximum-entropy probability distribution over all motifs subject to a constraint on mean information content. We can then sample from this distribution (exactly and efficiently) to estimate the null distribution of that property.
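
As a rough illustration of the idea, here is a minimal sketch of one way such a sampler could work. It is not necessarily the implementation used in this project, and it rests on several assumptions: a motif is N aligned DNA sites of equal length; its information content is the sum over columns of 2 bits minus the column's Shannon entropy; and the maximum-entropy distribution therefore takes the form P(motif) ∝ exp(λ · IC(motif)), which factorizes over columns. All function names (`column_ic`, `solve_lambda`, `sample_motif`, and so on) are hypothetical.

```python
import math
import random
from itertools import product

ALPHABET = "ACGT"  # assumption: DNA motifs


def column_ic(counts):
    """Information content (bits) of one column: log2(4) minus Shannon entropy."""
    n = sum(counts)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c > 0)
    return math.log2(len(ALPHABET)) - entropy


def count_vectors(n):
    """Yield every (nA, nC, nG, nT) of non-negative integers summing to n."""
    for a, c, g in product(range(n + 1), repeat=3):
        if a + c + g <= n:
            yield (a, c, g, n - a - c - g)


def multinomial(counts):
    """Number of distinct columns that share this count vector."""
    result = math.factorial(sum(counts))
    for c in counts:
        result //= math.factorial(c)
    return result


def column_distribution(n, lam):
    """Unnormalized MaxEnt weights over count vectors for a single column."""
    vectors = list(count_vectors(n))
    weights = [multinomial(v) * math.exp(lam * column_ic(v)) for v in vectors]
    return vectors, weights


def expected_column_ic(n, lam):
    """Mean information content of one column under the tilted distribution."""
    vectors, weights = column_distribution(n, lam)
    z = sum(weights)
    return sum(w * column_ic(v) for v, w in zip(vectors, weights)) / z


def solve_lambda(n, length, target_ic, lo=-50.0, hi=50.0, tol=1e-9):
    """Bisect for the lambda whose mean motif IC equals target_ic.

    The mean IC increases monotonically with lambda, so bisection suffices
    provided the target lies within the attainable range.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if length * expected_column_ic(n, mid) < target_ic:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def sample_motif(n, length, lam, rng=random):
    """Draw one motif (a list of n strings of the given length) exactly."""
    vectors, weights = column_distribution(n, lam)
    columns = []
    for counts in rng.choices(vectors, weights=weights, k=length):
        # Expand the count vector into bases, then pick a uniform ordering.
        column = [base for base, c in zip(ALPHABET, counts) for _ in range(c)]
        rng.shuffle(column)
        columns.append(column)
    return ["".join(col[i] for col in columns) for i in range(n)]
```

For example, `lam = solve_lambda(10, 8, 10.0)` followed by repeated calls to `sample_motif(10, 8, lam)` would give an ensemble of 10-site, 8-column motifs averaging 10 bits, on which the property of interest can be computed to build its null distribution. Sampling is exact because each column's count vector is drawn from its full (enumerated) distribution and then arranged uniformly at random; the enumeration grows as O(N³) for a four-letter alphabet, and the floating-point weights in this sketch would overflow for very large N.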