When Data Do Not Conform to Rows and Columns

Predictive modelling over non-conforming features

Most predictive models assume that features are arranged into rows and columns, like a spreadsheet, but many kinds of data do not conform to this structure. Sequences are one example, which is why they are usually stored in text files rather than spreadsheets. To build predictive models for sequences and other non-conforming features, we have developed what we call dynamic kernel matching (DKM).

We can think of DKM as directly analogous to a convolutional neural network (CNN), but for non-conforming features. DKM finds the arrangement of features that exhibits the maximal response, much as max pooling in a CNN identifies the image patch that exhibits the maximal response. To find this arrangement, we use alignment algorithms.
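
To make the idea of "finding the arrangement with the maximal response" concrete, here is a minimal sketch in Python. It is not the project's implementation: it assumes a Needleman-Wunsch-style global alignment between a sequence of feature vectors and a small kernel of weight vectors, and the names `one_hot`, `max_alignment_response`, and `gap_penalty` are hypothetical, chosen only for illustration.

```python
ALPHABET = "ACG"  # toy alphabet, assumed for this illustration only


def one_hot(sequence):
    """Encode each symbol as a one-hot feature vector over ALPHABET."""
    return [[1.0 if symbol == letter else 0.0 for letter in ALPHABET]
            for symbol in sequence]


def max_alignment_response(features, kernel, gap_penalty=0.0):
    """Return the best score over all order-preserving alignments.

    features: list of L feature vectors, one per sequence position
    kernel:   list of K weight vectors, one per kernel position
    An aligned pair (i, j) contributes dot(features[i], kernel[j]);
    every skipped sequence or kernel position costs gap_penalty.
    """
    L, K = len(features), len(kernel)
    # score[i][j] = best score aligning the first i features
    # with the first j kernel positions
    score = [[0.0] * (K + 1) for _ in range(L + 1)]
    for i in range(1, L + 1):
        score[i][0] = score[i - 1][0] - gap_penalty
    for j in range(1, K + 1):
        score[0][j] = score[0][j - 1] - gap_penalty
    for i in range(1, L + 1):
        for j in range(1, K + 1):
            match = sum(f * w for f, w in zip(features[i - 1], kernel[j - 1]))
            score[i][j] = max(
                score[i - 1][j - 1] + match,    # align feature i with weight j
                score[i - 1][j] - gap_penalty,  # skip a sequence position
                score[i][j - 1] - gap_penalty,  # skip a kernel position
            )
    return score[L][K]


# A two-position kernel that responds to "A" followed by "G".
kernel = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
print(max_alignment_response(one_hot("CAG"), kernel))  # 2.0
print(max_alignment_response(one_hot("GA"), kernel))   # 1.0
```

In this toy example, "CAG" reaches the full response of 2.0 because the alignment can skip the leading "C", while "GA" reaches only 1.0 because the alignment must preserve order. The maximum over alignments plays the role that max pooling plays over image patches; in a trained model the kernel weights would be learned rather than fixed by hand.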

We apply DKM to two T-cell receptor datasets to diagnose disease from the receptor sequences, showing that DKM works on some really hard problems!
