DECILE
Data Efficient Learning with Less Data
State-of-the-art AI and deep learning are very data hungry. This comes at significant cost: larger resource costs (multiple expensive GPUs and cloud costs), longer training times (often multiple days), and human labeling costs and time. DECILE attempts to address this by asking the following question: can we train state-of-the-art deep models on only a sample (say 5 to 10%) of a massive dataset, with negligible impact on accuracy?
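To make the question concrete, here is a minimal sketch of the simplest possible baseline: drawing a class-stratified random 10% subset of a labeled dataset. It is not DECILE's selection method. The function name `stratified_subset` and its parameters are hypothetical, chosen only for illustration. More informed selection strategies would replace the random draw with a criterion that scores how useful each example is.

```python
import random
from collections import defaultdict


def stratified_subset(labels, fraction=0.1, seed=0):
    """Return sorted indices of a class-stratified random subset.

    Hypothetical helper for illustration; not part of any DECILE library.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)

    # Group example indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    chosen = []
    for indices in by_class.values():
        # Keep at least one example per class so rare classes survive.
        k = max(1, round(fraction * len(indices)))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Toy dataset: 1,000 labels over 4 imbalanced classes.
labels = [0] * 700 + [1] * 200 + [2] * 90 + [3] * 10
subset = stratified_subset(labels, fraction=0.1)
print(len(subset), "of", len(labels), "examples selected")
```

The returned indices can be used to train on the selected examples only, for instance by passing them to whatever dataset-subsetting utility your framework provides. Comparing accuracy on this random subset against full-data training gives a reference point that any smarter selection strategy should beat.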
Why DECILE?
Addressing critical challenges in modern AI and deep learning
💰 Reduce Training Costs
State-of-the-art deep learning requires expensive GPUs and cloud infrastructure, costing thousands per experiment.
🏷️ Lower Labeling Expenses
Manual data annotation is time-consuming and expensive, often requiring domain experts for quality labels.
⚖️ Handle Noisy Data
Real-world datasets contain noise, outliers, and class imbalances that degrade model performance.
⚡ Accelerate Development
Training on massive datasets takes days or weeks, slowing down research iteration and deployment cycles.
Our Modules
DECILE provides cutting-edge tools and libraries for data-efficient machine learning
ML Efficiency for Large Models (MeLM)
Today's world needs orders of magnitude more efficient ML to address environmental and energy crises, optimize resource consumption and improve sustainability. With the end of Moore's Law and Dennard Scaling, we can no longer expect more and faster transistors for the same cost and power budget.
PI: Ganesh Ramakrishnan
Visit MeLM Research Group →