Large AI models have become powerful but increasingly impractical: escalating training costs, bloated memory requirements, and latency bottlenecks limit real-world deployment. This talk introduces CompactifAI, a quantum-inspired compression framework that uses tensor networks to surgically shrink large models while preserving their accuracy and capabilities.
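
To make the tensor-network idea concrete, here is a minimal, self-contained sketch (not CompactifAI's actual implementation): a layer's dense weight matrix is factored into a two-core matrix product operator (MPO) via truncated SVD, keeping only `chi` bond components. All names, shapes, and the bond dimension `chi=64` are illustrative assumptions.

```python
# Sketch only: MPO-style compression of one weight matrix via truncated SVD.
# Not CompactifAI's API; shapes and chi are illustrative assumptions.
import numpy as np

def mpo_compress(W, out_dims, in_dims, chi):
    """Split W (prod(out_dims) x prod(in_dims)) into two MPO cores,
    truncating the bond between them to dimension chi."""
    o1, o2 = out_dims
    i1, i2 = in_dims
    # View W as a 4-index tensor and regroup indices as (o1,i1) x (o2,i2).
    T = W.reshape(o1, o2, i1, i2).transpose(0, 2, 1, 3).reshape(o1 * i1, o2 * i2)
    U, S, Vt = np.linalg.svd(T, full_matrices=False)
    U, S, Vt = U[:, :chi], S[:chi], Vt[:chi, :]   # truncation = compression
    core1 = (U * S).reshape(o1, i1, chi)          # indices (out1, in1, bond)
    core2 = Vt.reshape(chi, o2, i2)               # indices (bond, out2, in2)
    return core1, core2

def mpo_to_matrix(core1, core2):
    """Contract the two cores back into a dense weight matrix."""
    o1, i1, _ = core1.shape
    _, o2, i2 = core2.shape
    T = np.einsum('aib,bcj->acij', core1, core2)  # shape (o1, o2, i1, i2)
    return T.reshape(o1 * o2, i1 * i2)

rng = np.random.default_rng(0)
# Build a weight matrix that secretly has MPO structure plus mild noise,
# standing in for the correlations a trained layer tends to exhibit.
g1 = rng.standard_normal((32, 32, 32))
g2 = rng.standard_normal((32, 32, 32))
W = mpo_to_matrix(g1, g2) + 0.1 * rng.standard_normal((1024, 1024))

c1, c2 = mpo_compress(W, out_dims=(32, 32), in_dims=(32, 32), chi=64)
W_hat = mpo_to_matrix(c1, c2)
ratio = W.size / (c1.size + c2.size)
err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
print(f"{W.size} -> {c1.size + c2.size} params ({ratio:.1f}x), rel. error {err:.4f}")
```

On this synthetic example the factorization stores roughly 8x fewer parameters at a small relative reconstruction error; the accuracy-versus-size trade-off is controlled entirely by the bond dimension `chi`.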