DECILE

Data Efficient Learning with Less Data

State-of-the-art AI and deep learning are very data hungry. This comes at significant cost: larger resource costs (multiple expensive GPUs and cloud bills), long training times (often multiple days), and human labeling cost and time. DECILE attempts to address this by answering the following question: can we train state-of-the-art deep models on only a sample (say 5 to 10%) of a massive dataset with negligible impact on accuracy?
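
To make the question concrete, here is a minimal sketch of the simplest possible baseline: drawing a class-stratified random subset at a fixed budget. It is for illustration only and is not DECILE's selection method. The function name `stratified_subset` and its parameters are hypothetical and not part of any DECILE library.

```python
import random
from collections import defaultdict


def stratified_subset(labels, fraction=0.1, seed=0):
    """Return sorted indices of a class-stratified random subset.

    Hypothetical helper for illustration; not part of any DECILE library.
    Each class contributes round(fraction * class_size) indices,
    with at least one index per class.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")

    rng = random.Random(seed)  # local RNG so results are reproducible

    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Sample the same fraction from every class.
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        k = max(1, round(fraction * len(indices)))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


if __name__ == "__main__":
    # Toy labels: 900 examples of class 0 and 100 of class 1.
    labels = [0] * 900 + [1] * 100
    subset = stratified_subset(labels, fraction=0.1)
    print(len(subset), "of", len(labels), "examples selected")
```

A model trained on the returned indices gives a reference point: any smarter selection strategy should match or beat this random baseline at the same budget.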

Why DECILE?

Addressing critical challenges in modern AI and deep learning

💰 Reduce Training Costs

State-of-the-art deep learning requires expensive GPUs and cloud infrastructure, costing thousands per experiment.

🏷️ Lower Labeling Expenses

Manual data annotation is time-consuming and expensive, often requiring domain experts for quality labels.

⚖️ Handle Noisy Data

Real-world datasets contain noise, outliers, and class imbalances that degrade model performance.

⚡ Accelerate Development

Training on massive datasets takes days or weeks, slowing down research iteration and deployment cycles.

Our Modules

DECILE provides cutting-edge tools and libraries for data-efficient machine learning

ML Efficiency for Large Models (MeLM)

Today's world needs orders of magnitude more efficient ML to address environmental and energy crises, optimize resource consumption and improve sustainability. With the end of Moore's Law and Dennard Scaling, we can no longer expect more and faster transistors for the same cost and power budget.

PI: Ganesh Ramakrishnan
