
Stochastic convex sparse principal component analysis

Overview of attention for article published in EURASIP Journal on Bioinformatics & Systems Biology, September 2016

Mentioned by

1 X user

Citations

3 Dimensions

Readers on

14 Mendeley
Title
Stochastic convex sparse principal component analysis
Published in
EURASIP Journal on Bioinformatics & Systems Biology, September 2016
DOI
10.1186/s13637-016-0045-x
Pubmed ID
Authors

Inci M. Baytas, Kaixiang Lin, Fei Wang, Anil K. Jain, Jiayu Zhou

Abstract

Principal component analysis (PCA) is a dimensionality reduction and data analysis tool commonly used in many areas. The main idea of PCA is to represent high-dimensional data with a few representative components that capture most of the variance present in the data. However, traditional PCA has an obvious disadvantage when it is applied to data where interpretability is important. In applications where the features have physical meaning, we lose the ability to interpret the principal components extracted by conventional PCA because each principal component is a linear combination of all the original features. For this reason, sparse PCA has been proposed to improve the interpretability of traditional PCA by introducing sparsity to the loading vectors of principal components. Sparse PCA can be formulated as an ℓ1-regularized optimization problem, which can be solved by proximal gradient methods. However, these methods do not scale well because computation of the exact gradient is generally required at each iteration. The stochastic gradient framework addresses this challenge by computing an expected gradient at each iteration. Nevertheless, stochastic approaches typically have low convergence rates due to the high variance. In this paper, we propose a convex sparse principal component analysis (Cvx-SPCA), which leverages a proximal variance reduced stochastic scheme to achieve a geometric convergence rate. We further show that the convergence analysis can be significantly simplified by using a weak condition which allows a broader class of objectives to be applied. The efficiency and effectiveness of the proposed method are demonstrated on a large-scale electronic medical record cohort.
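The abstract names two ingredients: a proximal step for the ℓ1 penalty and a variance-reduced stochastic gradient. The following is a minimal sketch of how those two pieces usually fit together, and it is not the paper's Cvx-SPCA algorithm. As an assumption, it uses a generic ℓ1-regularized least-squares objective as a stand-in, since the abstract does not state the Cvx-SPCA objective. The function names (`soft_threshold`, `grad_i`, `prox_svrg`) and all parameter values are hypothetical, and the reference point is simply set to the last inner iterate.

```python
import random

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied elementwise."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]

def grad_i(a, b, x):
    """Gradient of the single-sample loss 0.5 * (a.x - b)^2 (stand-in objective)."""
    r = sum(aj * xj for aj, xj in zip(a, x)) - b
    return [r * aj for aj in a]

def prox_svrg(A, b, lam, step, epochs=30, inner=None, seed=0):
    """Proximal variance-reduced stochastic gradient on
    (1/n) * sum_i 0.5 * (a_i.x - b_i)^2 + lam * ||x||_1."""
    rng = random.Random(seed)
    n, d = len(A), len(A[0])
    inner = inner or 2 * n
    x_ref = [0.0] * d
    for _ in range(epochs):
        # Exact gradient of the smooth part, computed once per epoch.
        mu = [0.0] * d
        for a, bi in zip(A, b):
            g = grad_i(a, bi, x_ref)
            mu = [m + gj / n for m, gj in zip(mu, g)]
        x = list(x_ref)
        for _ in range(inner):
            i = rng.randrange(n)
            g_x = grad_i(A[i], b[i], x)
            g_ref = grad_i(A[i], b[i], x_ref)
            # Variance-reduced gradient estimate: unbiased, and its
            # variance shrinks as x and x_ref approach the minimizer.
            v = [gx - gr + m for gx, gr, m in zip(g_x, g_ref, mu)]
            # Gradient step followed by the l1 proximal (shrinkage) step.
            x = soft_threshold([xj - step * vj for xj, vj in zip(x, v)],
                               step * lam)
        x_ref = x  # assumption: last iterate becomes the new reference
    return x_ref
```

Calling `prox_svrg(A, b, lam=0.05, step=0.02)` on a list-of-rows matrix `A` and a target list `b` returns a coefficient list in which the shrinkage step tends to set small entries exactly to zero. This is the sparsity effect the abstract attributes to the ℓ1 penalty.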

X Demographics

The data shown below were collected from the profile of 1 X user who shared this research output.
Mendeley readers

The data shown below were compiled from readership statistics for 14 Mendeley readers of this research output.

Geographical breakdown

Country Count As %
Panama 1 7%
Unknown 13 93%

Demographic breakdown

Readers by professional status Count As %
Other 2 14%
Student > Doctoral Student 2 14%
Student > Bachelor 1 7%
Professor 1 7%
Student > Ph. D. Student 1 7%
Other 3 21%
Unknown 4 29%

Readers by discipline Count As %
Mathematics 4 29%
Engineering 2 14%
Nursing and Health Professions 1 7%
Medicine and Dentistry 1 7%
Computer Science 1 7%
Other 0 0%
Unknown 5 36%
Attention Score in Context

This research output has an Altmetric Attention Score of 1. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 24 September 2016.
All research outputs: #20,674,485 of 25,394,764 outputs
Outputs from EURASIP Journal on Bioinformatics & Systems Biology: #33 of 53 outputs
Outputs of similar age: #264,744 of 340,279 outputs
Outputs of similar age from EURASIP Journal on Bioinformatics & Systems Biology: #4 of 5 outputs
Altmetric has tracked 25,394,764 research outputs across all sources so far. This one is in the 10th percentile – i.e., 10% of other outputs scored the same or lower than it.
So far Altmetric has tracked 53 research outputs from this source. They receive a mean Attention Score of 3.1. This one is in the 22nd percentile – i.e., 22% of its peers scored the same or lower than it.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 340,279 tracked outputs that were published within six weeks on either side of this one in any source. This one is in the 11th percentile – i.e., 11% of its contemporaries scored the same or lower than it.
We're also able to compare this research output to 5 others from the same source and published within six weeks on either side of this one.