1. Information about the Paper

Citation:

Cao, Hong, Minh Nhut Nguyen, Clifton Phua, Shonali Krishnaswamy, and Xiaoli Li. “An integrated framework for human activity classification.” In UbiComp, pp. 331-340. 2012.

Abstract

This paper presents an integrated framework to enable using standard non-sequential machine learning tools for accurate multi-modal activity recognition. We develop a novel framework that contains simple pre- and postclassification strategies to improve the overall performance. We achieve this through class-imbalance correction on the learning data using structure preserving oversampling (SPO), leveraging the sequential nature of sensory data using smoothing of the predicted label sequence and classifier fusion, respectively. Through evaluation on recent publicly available activity datasets comprising of a large amount of multi-dimensional sensory data, we demonstrate that our proposed strategies are effective in improving classification performance over common techniques such as One Nearest Neighbor (1NN) and Support Vector Machines (SVM). Our framework also shows better performance over sequential probabilistic models, such as Conditional Random Field (CRF) and Hidden Markov Model (HMM) and when these models are used as meta-learners.

 


2. My Review of the Paper

Summary:

The paper proposes a framework that performs better than earlier learning algorithms such as Naive Bayes, Decision Trees, Hidden Markov Model (HMM), Conditional Random Field (CRF), Nearest Neighbor (NN), and Support Vector Machine (SVM).

The integrated framework has the following goals:

  1. To reduce imbalance in the training data, which the paper corrects with structure preserving oversampling (SPO). The data may also contain missing values or vary widely, because different types of sensors are used and the subjects of the experiment differ.
  2. To smooth the predicted label sequence, removing noise and refining the classification result. A classifier fusion step then refines the result further.
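The two goals above can be pictured with a minimal sketch. It rests on loud assumptions: the oversampler below is plain random duplication, swapped in as a stand-in and not the paper's SPO, and the majority-vote smoother with its window size is an illustration rather than the authors' exact procedure. The names `random_oversample` and `smooth_labels` are hypothetical.

```python
import random
from collections import Counter

def random_oversample(X, y, seed=0):
    """Duplicate minority-class samples until every class is as large as the biggest.

    Assumption: plain random oversampling, a stand-in for SPO (not SPO itself).
    """
    rng = random.Random(seed)
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = list(X), list(y)
    for label, n in counts.items():
        pool = [x for x, lab in zip(X, y) if lab == label]
        for _ in range(target - n):
            X_out.append(rng.choice(pool))
            y_out.append(label)
    return X_out, y_out

def smooth_labels(labels, window=5):
    """Replace each predicted label with the majority label in a centered window.

    Assumption: majority voting; on a tie the original label is kept if it is
    among the most frequent ones.
    """
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        lo = max(0, i - half)
        hi = min(len(labels), i + half + 1)
        counts = Counter(labels[lo:hi])
        top = max(counts.values())
        winners = [lab for lab, c in counts.items() if c == top]
        smoothed.append(labels[i] if labels[i] in winners else winners[0])
    return smoothed
```

Calling `smooth_labels(predicted, window=3)` on a per-frame label sequence removes isolated one-frame flips, which is the kind of noise the post-classification step is meant to suppress.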

The research problem addressed in this paper is as follows:

  1. Sequential classifiers such as Hidden Markov Model (HMM) and Conditional Random Field (CRF) are the common algorithms for learning from sequences.
  2. Meanwhile, non-sequential classifiers such as Nearest Neighbor (NN) and Support Vector Machine (SVM) are competitive and scale well on high-dimensional, continuous-valued activity sensor data.
  3. Therefore, the authors propose a new classification framework that combines NN and SVM with SPO and other classification tools to maximize classification accuracy.
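To make the contrast in the list above concrete, here is a minimal sketch of a non-sequential classifier: a one-nearest-neighbor (1NN) rule that labels each sensor frame on its own, ignoring the frames before and after it. The Euclidean distance, the toy two-dimensional features, and the function names are assumptions for illustration, not details taken from the paper.

```python
import math

def predict_1nn(train_X, train_y, frame):
    """Return the label of the training sample closest to `frame` (Euclidean)."""
    nearest = min(range(len(train_X)), key=lambda i: math.dist(train_X[i], frame))
    return train_y[nearest]

def classify_sequence(train_X, train_y, frames):
    """Label every frame independently; no information flows between frames."""
    return [predict_1nn(train_X, train_y, f) for f in frames]

# Toy data (hypothetical): two features per frame, two activities.
train_X = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
train_y = ["sit", "sit", "walk", "walk"]
frames = [(0.05, 0.1), (0.95, 1.0), (0.2, 0.1)]
predicted = classify_sequence(train_X, train_y, frames)
```

Because each frame is classified in isolation, the temporal structure of the data has to be recovered elsewhere, which is where the smoothing and fusion steps of the framework come in.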

The proposed framework did outperform previous work in terms of classification accuracy.

Comments

Pro:

  1. Offers a significant improvement in classification accuracy in the form of an integrated framework.
  2. It is interesting that the first author of this paper previously published a paper on SPO: Structure Preserving Oversampling for Imbalanced Time Series Classification, IEEE, 2011. Two of the follow-up works are also by the same first author. This suggests that the author has a continuing interest and expertise in this field.
  3. This paper was published in 2012 and had been cited by 8 subsequent works by 2014, one of them a dissertation. This suggests that the machine learning research community has found the paper useful.

Cons:

  1. For sensors from different manufacturers, is there not a way to calibrate the hardware?
  2. It may be somewhat difficult for some readers to follow because it uses a lot of mathematical notation.
  3. A limitation of this research, as of any interpolation method, is that it is not expected to work well if 30% or more of the sensor data is missing.
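As a rough illustration of the interpolation concern in the last point, the sketch below fills gaps in a single sensor channel and reports how much of it was missing. Linear interpolation, the use of `None` to mark a missing reading, and both function names are assumptions made here; the paper's own handling of missing data is not reproduced.

```python
def missing_fraction(values):
    """Share of readings that are missing (marked as None)."""
    return sum(v is None for v in values) / len(values)

def interpolate_linear(values):
    """Fill None gaps linearly between the nearest known readings.

    Gaps at either end are filled with the nearest known reading.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no known readings to interpolate from")
    filled = list(values)
    for i in range(known[0]):                      # leading gap
        filled[i] = values[known[0]]
    for i in range(known[-1] + 1, len(values)):    # trailing gap
        filled[i] = values[known[-1]]
    for a, b in zip(known, known[1:]):             # interior gaps
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            filled[i] = values[a] + t * (values[b] - values[a])
    return filled
```

The more readings are missing, the longer the stretches that are filled with a straight line between distant known points, so the filled values carry less and less real signal. A check such as `missing_fraction(channel) >= 0.30` could flag channels in the range this review considers unreliable.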

Related:

Cheng, Heng-Tze, et al. “Towards zero-shot learning for human activity recognition using semantic attribute sequence model.” Proceedings of the 2013 ACM international joint conference on Pervasive and ubiquitous computing. ACM, 2013.

I think this is the most interesting follow-up work. It proposes a new learning algorithm for recognizing activities that have not been seen before. In other words, it tries to categorize a new activity without having any prior data for it.