Sequential Data

  • When working with sequences, we assume the following

    • Non-IID assumption: The standard IID assumption does not hold for our data points. Points within a sequence are not necessarily independent of one another. Instead, we assume that whole sequences are sampled independently of each other.
    • Stationarity: Sequential data is stationary if, while the specific values within the sequence may change, the dynamics that generate these observations do not.
    • Markov assumption: The data is Markovian, i.e. each data point depends only on a fixed number of its immediate predecessors (just the previous point, for a first-order model) rather than on the entire history.
  • For sequential data, we often consider the sequence in the natural reading order for a few reasons.

    • It is the most natural direction for us to think in.
    • Factorizing the joint probability in order (each element conditioned on the ones before it) lets a single model assign probabilities to sequences of any length by exploiting the structure of the ordering.
    • The ordering carries structure that a model can exploit: the words are not just placed at random.
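
The in-order factorization combined with the Markov and stationarity assumptions can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the function name `sequence_log_prob`, the two-symbol vocabulary, and every probability in the tables are made up for the example. It assumes a first-order chain, so p(x_1, ..., x_T) = p(x_1) * prod_t p(x_t | x_{t-1}), and it reuses one transition table at every position, which is where stationarity enters.

```python
import math


def sequence_log_prob(seq, initial, transition):
    """Log-probability of `seq` under a first-order Markov chain.

    Factorizes in order: log p(x_1) + sum_t log p(x_t | x_{t-1}).
    """
    log_p = math.log(initial[seq[0]])
    # The same transition table is used at every step (stationarity).
    for prev, cur in zip(seq, seq[1:]):
        log_p += math.log(transition[prev][cur])
    return log_p


# Hypothetical vocabulary and probabilities, chosen only for illustration.
initial = {"a": 0.5, "b": 0.5}
transition = {
    "a": {"a": 0.9, "b": 0.1},
    "b": {"a": 0.5, "b": 0.5},
}

# Equals log(0.5 * 0.9 * 0.1) up to floating-point rounding.
print(sequence_log_prob(["a", "a", "b"], initial, transition))
```

Because the score is a sum of per-step terms, the same two tables handle a sequence of any length, which is the point made in the second bullet above.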

Tasks

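  • One common task is $k$-step-ahead prediction.

    Let $x_1, \dots, x_t$ be a sequence of data points. The $k$-step-ahead prediction task asks us to predict $x_{t+k}$.

    This is achieved by using a model that takes $x_1, \dots, x_{t+k-1}$ as input, where the unobserved values $x_{t+1}, \dots, x_{t+k-1}$ are themselves obtained using the same model.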

  • Another task is called unsupervised density modelling, which may be formulated as follows:

    Given a collection of sequential data, estimate the probability mass function that tells us how likely we are to see any given sequence in this collection.
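
Both tasks can be sketched with toy stand-ins. Everything named here is an assumption for illustration: `k_step_ahead` rolls a one-step model forward by feeding its own predictions back in, `linear_extrapolation` is a placeholder for whatever learned model is actually used, and `empirical_pmf` is the most naive density estimator possible (relative frequencies of whole sequences), which assigns zero mass to anything unseen. A practical density model would instead factorize over positions.

```python
from collections import Counter


def k_step_ahead(history, one_step_model, k):
    """Predict the value k steps past the end of `history`.

    Each intermediate prediction is appended to the window and fed
    back into the same one-step model.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    window = list(history)
    for _ in range(k):
        window.append(one_step_model(window))
    return window[-1]


def linear_extrapolation(window):
    # Toy stand-in for a learned model: continue the last observed slope.
    return 2 * window[-1] - window[-2]


def empirical_pmf(sequences):
    """Estimate a pmf over whole sequences by relative frequency."""
    counts = Counter(tuple(s) for s in sequences)
    total = sum(counts.values())
    return {seq: c / total for seq, c in counts.items()}


print(k_step_ahead([1.0, 2.0, 3.0], linear_extrapolation, k=3))  # 6.0

pmf = empirical_pmf([["a", "b"], ["a", "b"], ["b", "a"], ["a", "a"]])
print(pmf[("a", "b")])  # 0.5
```

Note that errors compound in the rollout: any mistake in an intermediate prediction becomes part of the input for every later step.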

Topics

Links

  • C5W3LO4 Beam Search - beam search is a heuristic search algorithm related to BFS and DFS, but it is not guaranteed to find the maximum. Given beam width $B$, we keep the top $B$ most likely outputs at each step of the search. The goal is to find the most likely $T_y$-length sentence using this search.
  • C5W3LO4 Refining Beam Search - use length normalization to improve beam search: maximize the sum of log-likelihoods instead of the product of probabilities, and average that sum over the sentence length.
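
A minimal sketch of beam search with length normalization follows, under several assumptions that are not in the notes: the function `beam_search`, the end-of-sentence token `"</s>"`, the toy next-token table, and the exponent `alpha` (a softening hyperparameter, here 0.7; `alpha=1` gives a plain average over length) are all illustrative. Finished hypotheses compete for beam slots in this variant, which is one of several common choices.

```python
import math


def beam_search(next_log_probs, beam_width, max_len, eos="</s>", alpha=0.7):
    """Return (tokens, summed log-prob) of the best hypothesis found.

    `next_log_probs(tokens)` maps a prefix to {token: log-probability}.
    Hypotheses are ranked by summed log-probability during the search and
    by the length-normalized score sum / len**alpha at the end.
    """
    beams = [((), 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for tok, lp in next_log_probs(tokens).items():
                candidates.append((tokens + (tok,), score + lp))
        # Keep only the `beam_width` most likely extensions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            if tokens[-1] == eos:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # hypotheses cut off at max_len
    return max(finished, key=lambda c: c[1] / len(c[0]) ** alpha)


# Hypothetical bigram-style model with made-up probabilities.
TABLE = {
    None: {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.4, "</s>": 0.1},
    "a": {"cat": 0.9, "</s>": 0.1},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}


def toy_model(tokens):
    last = tokens[-1] if tokens else None
    return {tok: math.log(p) for tok, p in TABLE[last].items()}


print(beam_search(toy_model, beam_width=2, max_len=5)[0])  # ('a', 'cat', '</s>')
print(beam_search(toy_model, beam_width=1, max_len=5)[0])  # ('the', 'cat', '</s>')
```

With a beam width of 1 the search is greedy and commits to "the" (probability 0.6), ending with a sentence of probability 0.3; a width of 2 keeps "a" alive long enough to find the sentence with probability 0.36. Neither setting is guaranteed to find the global maximum in general.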