Cesar Augusto Vargas-García

Computational Discovery of Cancer Drug Resistance Networks


Publication

🧬 💊 Rare cell variability and drug-induced reprogramming as a mode of cancer drug resistance

Nature (2017)


The Challenge

Cancer patients often initially respond well to targeted drugs, but tumors almost inevitably develop resistance, leading to treatment failure. Understanding how this resistance emerges is crucial for developing better therapies. We discovered that rare cancer cells can coordinately express multiple resistance genes, but the key question was: which genes are the controllers and which are just followers?

Our Computational Innovation: The φ-Mixing Coefficient

What Makes It Special

Traditional methods for analyzing gene networks have critical limitations:

The φ-mixing coefficient overcomes all these limitations.

Intuitive Explanation

Imagine you’re studying a company to understand decision-making:

Correlation approach: “The CEO, managers, and employees all arrive at 9 AM” → Everyone seems equally important

φ-coefficient approach: “When the CEO decides to start at 8 AM, 90% of employees shift their schedule. When employees come early, only 5% of the time does the CEO change schedule” → Clear hierarchy revealed!

Mathematical Insight

φ(Gene B | Gene A) = Maximum over all states a of Gene A of:
|P(Gene B = ON | Gene A = a) - P(Gene B = ON)|

This measures how much knowing Gene A’s state changes our prediction of Gene B.
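As a rough illustration of the formula, here is a minimal Python sketch (not the binPhix MATLAB code). It assumes each gene is stored as a 0/1 vector over the same cells; the function name `phi` and the toy data are hypothetical and chosen only for this example.

```python
import numpy as np

def phi(b, a):
    """phi(B | A) for two binary (0/1) vectors measured over the same cells.

    Returns the largest absolute shift in P(B = ON) obtained by
    conditioning on a state of A that actually occurs in the data.
    """
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    p_b = b.mean()                      # marginal P(B = ON)
    shifts = []
    for state in (False, True):
        mask = (a == state)
        if mask.any():                  # skip states of A never observed
            shifts.append(abs(b[mask].mean() - p_b))
    return max(shifts)

# Toy data for 10 cells (made-up values, for illustration only)
gene_a = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
gene_b = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

phi_b_given_a = phi(gene_b, gene_a)     # how much A's state shifts B
phi_a_given_b = phi(gene_a, gene_b)     # how much B's state shifts A
print(f"phi(B | A) = {phi_b_given_a:.2f}, phi(A | B) = {phi_a_given_b:.2f}")
```

On this toy data, φ(B | A) is 0.70 while φ(A | B) is about 0.47: knowing Gene A's state shifts the prediction for Gene B more than the reverse.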

Key Properties

  1. Asymmetric: in general φ(A→B) ≠ φ(B→A), where φ(A→B) denotes φ(Gene B | Gene A); this reveals the direction of influence
  2. Range [0,1]: 0 = no influence, 1 = complete control
  3. Conditional Independence: Can detect and remove indirect effects
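To make the third property concrete, the sketch below builds a toy chain A → C → B in Python and checks whether the apparent A → B influence survives once Gene C is taken into account. The conditional variant `phi_given` (the largest φ found within any stratum of the third gene) is an assumption made for illustration; the source does not spell out the exact conditional test used by binPhix.

```python
import numpy as np

def phi(b, a):
    """phi(B | A): largest shift in P(B = ON) over the observed states of A."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    p_b = b.mean()
    return max(abs(b[a == s].mean() - p_b)
               for s in (False, True) if (a == s).any())

def phi_given(b, a, c):
    """Assumed conditional variant: largest phi(B | A) within any stratum of C."""
    a, b = np.asarray(a), np.asarray(b)
    c = np.asarray(c, dtype=bool)
    return max(phi(b[c == s], a[c == s])
               for s in (False, True) if (c == s).any())

def cells(a, c, n, n_b_on):
    """n cells with fixed A and C states, of which n_b_on have B = ON."""
    return [(a, c, 1)] * n_b_on + [(a, c, 0)] * (n - n_b_on)

# Toy chain: B depends only on C (ON in 75% of C=1 cells, 25% of C=0 cells),
# while C is more often ON when A is ON. Values are made up.
data = np.array(cells(1, 1, 8, 6) + cells(1, 0, 4, 1) +
                cells(0, 1, 4, 3) + cells(0, 0, 8, 2))
gene_a, gene_c, gene_b = data.T

marginal = phi(gene_b, gene_a)               # apparent A -> B influence
indirect = phi_given(gene_b, gene_a, gene_c) # A -> B once C is known
direct = phi_given(gene_b, gene_c, gene_a)   # C -> B once A is known
print(f"A->B: {marginal:.3f}, A->B given C: {indirect:.3f}, C->B given A: {direct:.3f}")
```

In this constructed example the A → B score is nonzero on its own but drops to zero within each stratum of C, while the C → B score stays positive after conditioning on A. That is the pattern expected of an indirect effect.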

Our Implementation

I developed binPhix, a MATLAB implementation of the φ-mixing algorithm specifically optimized for binary single-cell RNA data:

% Core algorithm steps
1. Start with all possible connections (n genes → n(n-1) directed edges)
2. Compute φ coefficient for each directed edge
3. Prune indirect connections using conditional independence
4. Result: True regulatory network
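The steps above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the binPhix implementation: the `min_phi` threshold, the rule of pruning an edge when conditioning on any single third gene pushes its score below that threshold, and all function names are hypothetical.

```python
from itertools import permutations
import numpy as np

def phi(b, a):
    """phi(B | A): largest shift in P(B = ON) over the observed states of A."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    p_b = b.mean()
    return max(abs(b[a == s].mean() - p_b)
               for s in (False, True) if (a == s).any())

def phi_given(b, a, c):
    """Assumed conditional score: largest phi(B | A) within any stratum of C."""
    return max(phi(b[c == s], a[c == s])
               for s in (False, True) if (c == s).any())

def build_network(x, min_phi=0.05):
    """x: cells-by-genes 0/1 matrix. Returns {(source, target): phi}."""
    x = np.asarray(x, dtype=bool)
    n_genes = x.shape[1]
    # Steps 1-2: score every ordered pair, n * (n - 1) candidates in total
    scores = {(i, j): phi(x[:, j], x[:, i])
              for i, j in permutations(range(n_genes), 2)}
    # Step 3: drop weak edges and edges explained away by a third gene
    kept = {}
    for (i, j), score in scores.items():
        others = [k for k in range(n_genes) if k not in (i, j)]
        explained = any(phi_given(x[:, j], x[:, i], x[:, k]) < min_phi
                        for k in others)
        if score >= min_phi and not explained:
            kept[(i, j)] = score
    return kept

def cells(a, c, n, n_b_on):
    """n cells with fixed A and C states, of which n_b_on have B = ON."""
    return [(a, c, 1)] * n_b_on + [(a, c, 0)] * (n - n_b_on)

# Toy chain A -> C -> B with made-up counts; columns are A (0), C (1), B (2)
data = np.array(cells(1, 1, 8, 6) + cells(1, 0, 4, 1) +
                cells(0, 1, 4, 3) + cells(0, 0, 8, 2))
network = build_network(data)
print(sorted(network))
```

On the toy chain, the six candidate edges reduce to four: the links between A and C and between C and B remain, while both A-to-B directions are pruned as indirect. This sketch keeps both directions of each surviving link; it does not attempt to decide which direction is the regulatory one.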

Technical Innovations


Key Discoveries

From 342 Possible Connections to 68 True Influences

About 80% of the apparent gene-gene connections (274 of 342) were indirect effects!
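The numbers fit together as simple arithmetic: 19 resistance markers give 19 × 18 directed candidate edges. A quick Python check, using only figures stated on this page:

```python
n_genes = 19                                   # resistance markers analyzed
candidate_edges = n_genes * (n_genes - 1)      # every ordered gene pair
retained_edges = 68                            # edges kept after pruning
pruned_edges = candidate_edges - retained_edges
pruned_fraction = pruned_edges / candidate_edges

print(f"{candidate_edges} candidates, {pruned_edges} pruned "
      f"({pruned_fraction:.1%}), {retained_edges} retained")
# prints "342 candidates, 274 pruned (80.1%), 68 retained"
```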

The Master Regulators

Our analysis revealed just 4 key upstream genes controlling the resistance program:

  1. NRG1 - Neuregulin 1
    • Directly controls: VEGFC, AXL, JUN, WNT5A, LOXL2
    • Creates multiple feedforward loops
  2. RUNX2 - Runt-related transcription factor 2
    • Orchestrates transcriptional programs
    • No incoming regulatory edges
  3. EGFR - Epidermal growth factor receptor
    • Weak but significant control over WNT5A, JUN
  4. PDGFRB - Platelet-derived growth factor receptor β
    • Multiple weak outgoing edges

Biological Impact

This discovery suggests that targeting these 4 master regulators could prevent the emergence of drug resistance, rather than trying to target all 19 resistance markers individually.


Why This Matters

For Cancer Treatment

For Computational Biology

For Future Research


Visual Summary

Traditional View (Correlation):
All 19 resistance genes seem equally important
🔴↔️🔴↔️🔴↔️🔴↔️🔴... (342 connections)

Our Discovery (φ-mixing):
4 master regulators control the rest
    NRG1 ──→ VEGFC
      ├────→ AXL
      ├────→ JUN      
      └────→ WNT5A
           └──→ LOXL2

Code Availability

The binPhix algorithm implementation is available as part of the Nature publication supplementary materials. For researchers interested in applying this method to their own single-cell data, please refer to the detailed implementation guide in the supplementary information.


Collaborators

This work was a collaboration with the University of Pennsylvania, including Sydney M. Shaffer, Margaret C. Dunagin, Stefan R. Torborg, Eduardo A. Torre, Benjamin Emert, Clemens Krepler, and Arjun Raj.

