In Computational Biology, a hidden Markov model (HMM) is a statistical approach that is frequently used for modelling biological sequences. In applying it, a sequence is modelled as the output of a discrete stochastic process, which progresses through a series of states that are ‘hidden’ from the observer. In an HMM, the system being modelled is assumed to be a Markov process with unknown parameters, and the challenge is to determine the hidden parameters from the observable ones. Many Machine Learning techniques based on HMMs have been successfully applied to problems including speech recognition, optical character recognition and computational biology, and HMMs have become a fundamental tool in bioinformatics: thanks to their robust statistical foundation, conceptual simplicity and malleability, they can be adapted to fit diverse classification problems. A good HMM accurately models the real-world source of the observed data and has the ability to simulate that source. HMMs are statistical models that capture hidden information from observable sequential symbols (e.g., a nucleotide sequence). They have many applications in sequence analysis, in particular to predict exons and introns in genomic DNA, to identify functional motifs (domains) in proteins (profile HMMs) and to align two sequences (pair HMMs).
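To make the idea of recovering hidden states from observable symbols concrete, here is a minimal sketch (not from the article): a hypothetical two-state HMM over DNA, with invented "GC-rich" and "AT-rich" hidden states and made-up probabilities, decoded with the Viterbi algorithm in log space for numerical stability.

```python
import math

# Hypothetical two-state HMM for DNA (all names and probabilities
# are illustrative assumptions, not taken from the article).
states = ["GC-rich", "AT-rich"]
start = {"GC-rich": 0.5, "AT-rich": 0.5}
trans = {
    "GC-rich": {"GC-rich": 0.9, "AT-rich": 0.1},
    "AT-rich": {"GC-rich": 0.1, "AT-rich": 0.9},
}
emit = {
    "GC-rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "AT-rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}

def viterbi(seq):
    """Return the most likely hidden-state path for an observed sequence."""
    # V[i][s]: log-probability of the best path ending in state s at position i
    V = [{s: math.log(start[s]) + math.log(emit[s][seq[0]]) for s in states}]
    back = []  # back-pointers for the traceback
    for sym in seq[1:]:
        col, ptr = {}, {}
        for s in states:
            prev, lp = max(
                ((p, V[-1][p] + math.log(trans[p][s])) for p in states),
                key=lambda x: x[1],
            )
            col[s] = lp + math.log(emit[s][sym])
            ptr[s] = prev
        V.append(col)
        back.append(ptr)
    # Trace back the highest-scoring path from the final column
    last = max(V[-1], key=V[-1].get)
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

print(viterbi("GGCGCGAATATA"))
```

The same dynamic-programming structure underlies the gene-finding and profile-HMM applications mentioned above; real tools differ mainly in the size of the state space and in how the probabilities are estimated from data.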
Hidden Markov models are named after the Russian mathematician Andrey Andreyevich Markov, who developed much of the relevant statistical theory; they were introduced and studied in the early 1970s. They were first used in speech recognition and have been successfully applied to the analysis of biological sequences since the late 1980s. Nowadays, they are considered a specific form of dynamic Bayesian networks, which are based on the theory of Bayes.

Monica Franzese, Antonella Iuliano, in Encyclopedia of Bioinformatics and Computational Biology, 2019
Introduction