
Jab humein kisi continuous number ko predict karna ho (jaise ghar ki price ya stock market), toh hum Regression use karte hain. Lekin sirf ek "Straight Line" khinchna kafi nahi hota. Real-world data aksar "Overfit" ho jata hai ya phir bahut "Tedha" (Non-linear) hota hai.
1. Linear Regression: The Baseline
Linear Regression ek seedhi line dhoondhta hai ($y = mx + c$).
- Cost Function: Model ye dhoondhta hai ki "Galti" (Error) sabse kam kahan hai.
- Gradient Descent: Ye wo "Dheere-dheere utarne" ka rasta hai jisse model apne weights ko optimize karta hai taaki error minimum ho.
2. Polynomial Regression: Curved Trends
Har cheez line mein nahi chalti.
- Agar aapka data "S" shape ya curve mein hai, toh hum features ki power ($x^2, x^3$) badha dete hain.
- Ise Polynomial Regression kehte hain. Par dhyan rahe, zaroorat se zyada power model ko "Ziddi" (Overfit) bana sakti hai.
3. Regularization: Ridge aur Lasso
Jab model training data ko "Rat" (Memorize) leta hai, toh wo naye data par fail ho jata hai. Isse bachne ke liye hum Penalty lagate hain.
- Ridge (L2): Ye weights ko chhota karta hai par zero nahi karta.
- Lasso (L1): Ye faltu features ke weights ko Zero kar deta hai (Feature Selection).
4. Evaluation: R-squared vs Adjusted R-squared
- R-squared ($R^2$): Batata hai ki model kitna accha hai (0 se 1 tak).
- Adjusted $R^2$: Ye tab use hota hai jab hamare paas bahut saare features hon. Ye faltu features add karne par score ko "Punish" karta hai.
5. Summary Table: Regression Types
| Algorithm | Best For | Benefit |
|---|---|---|
| Simple Linear | Straight line trends | Simple & Interpretable |
| Polynomial | Curved trends | Captures non-linearity |
| Ridge (L2) | Reducing Overfitting | Keeps all features |
| Lasso (L1) | Feature Selection | Removes useless features |
FAQs
1. Gradient Descent mein "Learning Rate" kya hai? Ye wo "Step size" hai jisse model seekhta hai. Agar ye bahut bada ho toh model rasta bhatak jayega, agar bahut chhota ho toh training mein sadiyan lag jayengi.
2. Ridge aur Lasso mein se kaunsa chunu? Agar aapko lagta hai ki aapke saare features important hain, toh Ridge use karein. Agar aapko lagta hai ki kuch hi features kaam ke hain, toh Lasso use karein.
3. "Multicollinearity" kya hoti hai? Jab aapke do inputs aapas mein jude hon (e.g., Temperature in Celsius and Fahrenheit). Ye regression model ko confuse kar dete hain.
4. RMSE aur MAE mein kya fark hai? RMSE badi galtiyon ko zyada "Saza" (Square) deta hai. MAE sirf average error batata hai. AI mein RMSE zyada popular hai.
Regression Machine Learning ka "Pehla Kadam" hai. Ise samajh kar aap kisi bhi numerical trend ko predict kar sakte hain! ๐
Tarun ke baare mein: Tarun predictive modeling aur statistical regularization ke specialist hain. AI-Gyani par har prediction optimized hai.