Skip to content

Methods

TWAS Pipeline

MyoScore is constructed by integrating genome-wide association studies (GWAS) with tissue-specific gene expression through transcriptome-wide association studies (TWAS).

27 muscle-related GWAS (>1M participants)


    FUSION TWAS (v1.1.0)

    GTEx v8 skeletal muscle eQTL weights (n = 803)
    1000 Genomes Phase 3 EUR LD reference (n = 503)


    1,116 TWAS-significant genes (Bonferroni P < 1.68 × 10⁻⁵)

    417 detectable in bulk RNA-seq (591 gene-dimension entries)


    5 dimensions → MyoScore (0–100)

GWAS Phenotypes (27 total)

DimensionPhenotypesCountDirection
StrengthGrip strength, walking pace, muscle weakness, grip cross-sectional area5Direct
MassWhole body FFM, appendicular lean mass, leg/arm/trunk FFM, etc.15Direct
LeanMuscleAnterior/posterior thigh fat infiltration2Reversed (less fat = higher score)
YouthTelomere length1Direct
ResilienceMuscular dystrophy, other myopathies, myopathy phecode, creatine kinase5Reversed (less disease = higher score)

GWAS summary statistics were obtained from OpenGWAS (IEU), UK Biobank (Neale Lab) and FinnGen release 9. All were harmonized to GRCh37/hg19 and filtered for MAF > 1% and imputation quality > 0.8.

Gene Weight and Direction Assignment

  • Gene weights: log10(PTWAS)
  • Effect direction: Determined from TWAS Z-scores, with reversals applied to negative phenotypes (fat infiltration, myopathy diagnoses, CK) so that higher expression of positively weighted genes consistently indicates better muscle health

MyoScore Calculation

  1. Normalization: Raw counts → TMM normalization (edgeR) → log₂(CPM + 1)
  2. Standardization: Gene-wise Z-score across all input samples
  3. Dimension scoring: Weighted average within each dimension:
Dimension score=(zi×di×wi)wi

where zi = Z-score, di = direction (+1 or −1), wi = gene weight 4. Scaling: Min–max normalization to 0–100 per dimension 5. Composite: MyoScore=0.252×Strength+0.177×Mass+0.243×LeanMuscle+0.242×Youth+0.087×Resilience

Mathematical Framework

Gene Weight

The weight of gene g in dimension d:

wg,d=log10(Pg,d)

where Pg,d is the minimum TWAS P value for gene g across all phenotypes assigned to dimension d.

Direction Assignment

The effect direction from TWAS Z-scores:

δg={+1if Zg>0 for positive phenotypes (e.g., grip strength)1if Zg<0 for positive phenotypes

For negative phenotypes (fat infiltration, CK, myopathy diagnoses), direction is inverted: δgfinal=δg, ensuring +1 always means higher expression → better muscle health.

Expression Preprocessing

Given raw count matrix XRG×N:

CPMg,s=Xg,sg=1GXg,s×106,Eg,s=log2(CPMg,s+1)

Gene-wise Z-score

zg,s=Eg,sE¯gσg

Dimension Sub-score

Sd,sraw=gGdzg,sδgwg,dgGdwg,d

Rescaled to [0, 100] via min–max normalization:

Sd,s=Sd,srawmins(Sd,sraw)maxs(Sd,sraw)mins(Sd,sraw)×100

Composite MyoScore

MyoScores=d=15αdSd,s

Data-Driven Weights

Dimension weights derived from absolute Spearman correlation with binary disease severity:

αd=|ρd|d=15|ρd|
DimensionWeight (αd)Spearman |ρ|
Strength0.2520.482
Mass0.1770.338
LeanMuscle0.2430.465
Youth0.2420.462
Resilience0.0870.166

Gene Counts by Dimension

DimensionTotal TWAS genesDetectable in bulk RNA-seqSource phenotypes
Strength67315
Mass38221915
LeanMuscle2721472
Youth81371
Resilience3141575
Total1,116591 entries (417 unique genes)28

Key Notation

SymbolDefinition
g, s, dGene, sample, dimension index
Pg,dTWAS P value
wg,dGene weight: log10(Pg,d)
δgDirection (+1 = healthy, −1 = disease)
zg,sZ-score-normalized expression
Sd,sDimension sub-score (0–100)
αdDimension weight (data-driven)
GdSet of detectable genes in dimension d

Pathway Enrichment by Dimension

DimensionKey Enriched PathwaysRepresentative Genes
StrengthAcetyl-CoA metabolism, ethanol metabolism, propanoate metabolismACSS2, ACSS3, SULT1A1
MassGolgi-to-plasma membrane transport
LeanMuscleProtein localization to centrosome; vesicular transport (unhealthy direction)NUDCD3, DCTN2, CEP250
YouthmRNA methyltransferase activity, MHC class I protein bindingMETTL16, TRMT61B, PILRA
ResilienceIron–sulfur cluster binding (disease direction)CISD1, CISD2, ISCU

Data Sources

CohortnSourceDescription
GTEx v8803Genotype-Tissue Expression ProjectAutopsy skeletal muscle
GEO668NCBI GEO (15 studies)Multiple myopathy and muscle studies
Helsinki Myofin154University of HelsinkiTitinopathy, IBM, control
HuashanMuscle97Huashan Hospital, Fudan UniversityDM1, LGMD, control

Total: 1,722 human skeletal muscle RNA-seq transcriptomes across four independent cohorts.

Technical Robustness

  • Cross-platform: No significant difference between DNBSEQ-T7 and NovaSeq 6000 (P = 0.37)
  • Library prep: Concordant scores between polyA selection and ribosomal depletion (P = 0.29)
  • Within-individual: Rectus femoris vs vastus lateralis correlation r = 0.60 (P = 0.032)
  • Ischemic time: Apparent correlation (r = 0.40) is Simpson's paradox; partial r = 0.14 after controlling for death type

Mendelian Randomization

Two-sample MR using skeletal muscle cis-eQTL (GTEx v8, n = 803) as instruments and UK Biobank GWAS as outcomes:

  • 28/36 gene–outcome pairs (78%) showed directions concordant with MyoScore predictions
  • Tissue specificity proved critical: blood eQTL (eQTLGen) gave discordant directions for ACSS2 and GGT7, resolved to 4/4 concordance with muscle eQTL

UK Biobank Biomarker Proxies

BiomarkerGene proxynKey finding
Plasma acetateACSS2272,474100% direction concordance across all muscle phenotypes
Serum GGTGGT7467,12375% direction concordance
最近更新