1. How can supervised learning be distinguished from unsupervised learning?
Ans:
Supervised learning works with labeled datasets, allowing models to learn the relationship between inputs and outputs for accurate predictions or classifications. Unsupervised learning uses unlabeled data to identify patterns, clusters, or hidden structures. This method is especially helpful for segmenting users or discovering natural groupings within data.
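A minimal sketch of the contrast, assuming scikit-learn is available: the same toy data is used with labels for classification (supervised) and without labels for clustering (unsupervised).

```python
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

X, y = make_blobs(n_samples=300, centers=3, random_state=0)

# Supervised: learn a mapping from features X to known labels y.
clf = LogisticRegression(max_iter=1000).fit(X, y)
print("classification accuracy:", clf.score(X, y))

# Unsupervised: discover structure in X without ever seeing y.
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
print("discovered cluster ids:", set(clusters))
```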
2. How does transfer learning enhance machine learning models?
Ans:
Transfer learning improves model performance by adapting a system trained on a large dataset to a smaller, specialized task. This approach reduces training time, optimizes resource usage and helps models perform effectively even with limited data. Leveraging pre-learned knowledge delivers better results on specialized tasks than training from scratch on scarce data.
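A hedged sketch of the idea, assuming a recent torchvision with pre-trained ResNet-18 weights: the ImageNet-trained backbone is frozen and only a small new head is trained for a hypothetical 5-class task.

```python
import torch
import torch.nn as nn
from torchvision import models

# Load a model with pre-learned ImageNet weights.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Freeze the backbone so its knowledge is reused, not retrained.
for param in model.parameters():
    param.requires_grad = False

# Replace the final layer for the new, smaller task (5 classes is illustrative).
model.fc = nn.Linear(model.fc.in_features, 5)

# Only the new head's parameters are optimized.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```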
3. What is overfitting in machine learning and why is it an issue?
Ans:
Overfitting occurs when a model captures not only meaningful patterns but also noise and irrelevant details from training data, leading to poor performance on unseen data. Techniques such as regularization and pruning mitigate overfitting, while cross-validation helps detect it before deployment. Maintaining this balance ensures the model generalizes well across different datasets.
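A small sketch with scikit-learn: cross-validation exposes overfitting that a training-set score hides, and an L2 penalty (Ridge) reins the model in. The data and polynomial degree are illustrative.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(30, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=30)

# A high-degree polynomial with no penalty tends to chase noise.
overfit = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
# The same features with an L2 penalty generalize better.
regularized = make_pipeline(PolynomialFeatures(degree=15), Ridge(alpha=1.0))

for name, model in [("unregularized", overfit), ("ridge", regularized)]:
    cv = cross_val_score(model, X, y, cv=5, scoring="r2")
    print(name, "mean CV R^2:", round(cv.mean(), 3))
```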
4. What are GANs and how do they operate?
Ans:
Generative Adversarial Networks pair two neural networks: a generator that produces synthetic data and a discriminator that evaluates its authenticity. The two networks train competitively, pushing the generator to produce increasingly realistic outputs. GANs are widely used for image synthesis, data augmentation and creative content generation.
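A bare-bones PyTorch sketch of the adversarial setup (network sizes and the stand-in "real" data are illustrative): the generator maps noise to samples, the discriminator scores realness, and the two are optimized against each other.

```python
import torch
import torch.nn as nn

latent_dim, data_dim = 16, 2
G = nn.Sequential(nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, data_dim))
D = nn.Sequential(nn.Linear(data_dim, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCELoss()

for step in range(200):
    real = torch.randn(64, data_dim) * 0.5 + 2.0   # placeholder "real" distribution
    fake = G(torch.randn(64, latent_dim))

    # Discriminator step: label real samples 1, generated samples 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator step: try to make the discriminator score fakes as real.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```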
5. How do attention mechanisms enhance transformer models?
Ans:
Attention mechanisms enable transformers to focus on the most relevant parts of an input sequence. Self-attention allows each word or element to reference others, capturing context and relationships efficiently. This capability significantly improves accuracy in tasks such as language understanding, translation and AI content generation.
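A compact NumPy sketch of scaled dot-product self-attention, the core operation described above; real transformers add learned query/key/value projections, multiple heads and masking.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X):
    # In this stripped-down version, queries, keys and values are the inputs themselves.
    Q, K, V = X, X, X
    scores = Q @ K.T / np.sqrt(X.shape[-1])   # how strongly each token attends to every other
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V                        # context-aware representation of each token

tokens = np.random.rand(4, 8)                 # 4 tokens, embedding size 8
print(self_attention(tokens).shape)           # (4, 8)
```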
6. What role does feature engineering play in machine learning?
Ans:
Feature engineering transforms raw data into meaningful inputs by selecting, refining, or creating new variables. Well-designed features help algorithms detect critical patterns, improve prediction accuracy and support better generalization. Poorly chosen features, however, can limit a model’s learning capability.
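An illustrative pandas sketch (column names are hypothetical): deriving features that expose patterns the raw fields hide, such as spend per order, signup recency and encoded categories.

```python
import pandas as pd

df = pd.DataFrame({
    "signup_date": pd.to_datetime(["2023-01-10", "2023-06-05", "2024-02-20"]),
    "total_spend": [250.0, 90.0, 400.0],
    "num_orders": [5, 3, 8],
    "country": ["US", "DE", "US"],
})

df["avg_order_value"] = df["total_spend"] / df["num_orders"]                    # ratio feature
df["days_since_signup"] = (pd.Timestamp("2024-06-01") - df["signup_date"]).dt.days
df = pd.get_dummies(df, columns=["country"])                                    # encode categoricals
print(df.head())
```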
7. How is missing data typically handled in datasets?
Ans:
Missing data can be addressed through imputation methods like replacing values with the mean, median, or mode, or by using algorithms that manage missing entries naturally. In cases of excessive missing data, removing affected rows or columns may be appropriate. The approach depends on the dataset’s size and the impact of missing information.
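A small sketch of the strategies mentioned above, assuming pandas and scikit-learn: mean and most-frequent imputation for individual columns, with row dropping as the fallback when too much is missing.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer

df = pd.DataFrame({"age": [25, np.nan, 40, 33], "city": ["NY", "LA", np.nan, "NY"]})

# Numeric column: fill gaps with the mean; categorical column: fill with the mode.
df["age"] = SimpleImputer(strategy="mean").fit_transform(df[["age"]]).ravel()
df["city"] = SimpleImputer(strategy="most_frequent").fit_transform(df[["city"]]).ravel()

# Alternative when missingness is excessive: drop affected rows instead of imputing.
df_dropped = df.dropna()
print(df)
```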
8. How do Random Forest and XGBoost differ in methodology?
Ans:
Random Forest builds multiple independent decision trees and averages their results to reduce overfitting and variance. XGBoost constructs trees sequentially, with each tree correcting errors from the previous one, achieving strong predictive accuracy. Both are effective but employ distinct learning strategies for problem-solving.
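A side-by-side sketch on synthetic data, assuming the xgboost package is installed alongside scikit-learn: bagged independent trees (Random Forest) versus sequentially boosted trees (XGBoost).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Independent trees trained in parallel, predictions averaged.
rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
# Trees added one after another, each correcting the previous ensemble's errors.
xgb = XGBClassifier(n_estimators=200, learning_rate=0.1, random_state=0).fit(X_tr, y_tr)

print("Random Forest accuracy:", rf.score(X_te, y_te))
print("XGBoost accuracy:", xgb.score(X_te, y_te))
```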
9. How is the performance of a machine learning model evaluated?
Ans:
Model evaluation depends on the task type. Classification tasks use metrics such as accuracy, precision, recall, F1-score and ROC AUC, while regression tasks rely on measures like mean squared error, mean absolute error and R-squared. These metrics help determine the model’s reliability and suitability for real-world deployment.
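A minimal sketch with scikit-learn metrics, using toy placeholder values, showing how classification and regression are scored differently as described above.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, mean_squared_error, mean_absolute_error, r2_score)

# Classification: compare predicted labels with the ground truth.
y_true, y_pred = [1, 0, 1, 1, 0], [1, 0, 0, 1, 0]
print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall   :", recall_score(y_true, y_pred))
print("F1       :", f1_score(y_true, y_pred))

# Regression: compare predicted values with the actual numbers.
y_true_r, y_pred_r = [3.0, 5.0, 2.5], [2.8, 5.4, 2.0]
print("MSE:", mean_squared_error(y_true_r, y_pred_r))
print("MAE:", mean_absolute_error(y_true_r, y_pred_r))
print("R^2:", r2_score(y_true_r, y_pred_r))
```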
10. What ethical factors should be considered when designing AI systems?
Ans:
Ethical AI requires fairness by minimizing bias in training data, transparency in model operations and protection of user privacy. Building accountable and explainable systems ensures trust and prevents misuse. Ethical design aligns AI solutions with societal values and ensures safe, responsible deployment.
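One concrete fairness check as a hedged sketch (column names and data are hypothetical): comparing positive-prediction rates across groups to flag potential bias before deployment.

```python
import pandas as pd

results = pd.DataFrame({
    "group": ["A", "A", "B", "B", "B", "A"],
    "approved": [1, 0, 0, 0, 1, 1],   # the model's positive predictions
})

# Approval rate per group; a large gap suggests the model treats groups unequally.
rates = results.groupby("group")["approved"].mean()
print(rates)
print("disparity:", round(rates.max() - rates.min(), 2))
```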