Q1 - What does it mean if a model is heteroscedastic?
A model is heteroscedastic when the variance of its errors is not constant across the range of the data. Conversely, a model is homoscedastic when the variance of its errors is constant across the range of the data.
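As a rough illustration, the sketch below simulates two sets of errors and compares their variance in the upper and lower halves of the x range. The data, the noise scale of 0.5, and the helper `variance_ratio` are all made up for this example; it is an informal check, not a formal test for heteroscedasticity.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical data: x runs from 0.1 to 100.
xs = [i / 10 for i in range(1, 1001)]

# Heteroscedastic errors: the noise standard deviation grows with x.
hetero_errors = [random.gauss(0, 0.5 * x) for x in xs]
# Homoscedastic errors: the noise standard deviation is the same everywhere.
homo_errors = [random.gauss(0, 0.5) for _ in xs]


def variance_ratio(errors):
    """Error variance in the upper half of x divided by that in the lower half."""
    half = len(errors) // 2
    return statistics.pvariance(errors[half:]) / statistics.pvariance(errors[:half])


hetero_ratio = variance_ratio(hetero_errors)  # well above 1
homo_ratio = variance_ratio(homo_errors)      # close to 1
```

A ratio far from 1 suggests that the error variance changes across the data, which is the signature of heteroscedasticity.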
Q2 - What is the meaning of selection bias?
Selection bias arises when individuals or groups are selected for a sample in a way that is not random, whether knowingly or unknowingly. If proper randomization is not achieved, the sample will not accurately represent the population.
Q3 - What is the meaning of standard deviation?
Standard deviation measures how far the data points typically are from the mean. A low standard deviation indicates that the data points lie close to the mean, and a high standard deviation indicates that they are spread far from it.
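A small sketch of this idea, using two made-up datasets that share the same mean but differ in spread. It uses the population standard deviation (`statistics.pstdev`); the sample version would be `statistics.stdev`.

```python
import statistics

# Two hypothetical datasets with the same mean (50) but different spread.
tight = [48, 49, 50, 51, 52]
spread = [10, 30, 50, 70, 90]

tight_sd = statistics.pstdev(tight)    # about 1.41: points sit close to the mean
spread_sd = statistics.pstdev(spread)  # about 28.28: points sit far from the mean
```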
Q4 - What's the difference between Probability Mass Functions and Probability Density Functions?
Probability mass functions (PMFs) describe discrete probability distributions and give the probability that an observation is exactly equal to a given value.
Probability density functions (PDFs) describe continuous probability distributions. The probability of any single exact value is zero, so a PDF instead gives the probability that an observation falls within an interval, computed as the area under the curve over that interval.
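The contrast can be sketched with one discrete and one continuous distribution. The choice of a binomial (10 fair coin flips) and a standard normal is only for illustration, and `binomial_pmf` is a helper written for this example.

```python
from math import comb
from statistics import NormalDist


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Discrete case: the PMF gives the probability of one exact value.
p_exactly_3_heads = binomial_pmf(3, 10, 0.5)  # 120 / 1024

# Continuous case: probability is the area under the PDF over an interval,
# obtained here as a difference of cumulative distribution values.
standard_normal = NormalDist(mu=0, sigma=1)
p_between = standard_normal.cdf(1) - standard_normal.cdf(-1)  # about 0.683
```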
Q5 - What are observational and experimental data in Statistics?
Observational data is obtained from observational studies, in which variables are observed without intervention to check whether there is any correlation between them. Experimental data is obtained from experimental studies, in which the researcher deliberately changes certain variables while holding others constant to see what effect the change produces.
Q6 - What are a few ways to handle missing data?
Q7 - What is the meaning of KPI?
KPI stands for Key Performance Indicator. It is a metric used to measure performance from various perspectives. Examples of KPIs:
Q8 - What is the meaning of the five-number summary in Statistics?
The five-number summary consists of five values that together describe the entire range of the data: the minimum, the first quartile (Q1), the median, the third quartile (Q3), and the maximum.
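A minimal sketch of computing the five values (minimum, first quartile, median, third quartile, maximum) for a made-up dataset. Quartile conventions differ between textbooks and libraries; this example assumes the "inclusive" method of `statistics.quantiles`, and `five_number_summary` is a helper name chosen for illustration.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)


# Hypothetical data for illustration.
summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
# summary is (1, 3.0, 5.0, 7.0, 9)
```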
Q9 - What is the empirical rule?
In statistics, the empirical rule states that in a normal distribution, about 68% of values fall within one standard deviation of the mean, about 95% fall within two standard deviations, and about 99.7% fall within three standard deviations of the mean.
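These percentages can be checked against the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the variable names are chosen for this example.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, standard deviation 1

# Probability of falling within k standard deviations of the mean.
coverage = {
    k: standard_normal.cdf(k) - standard_normal.cdf(-k)
    for k in (1, 2, 3)
}
# coverage[1] is about 0.683, coverage[2] about 0.954, coverage[3] about 0.997
```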
Q10 - Three ants are sitting at the corners of an equilateral triangle. Each ant randomly picks a direction and starts moving along the edge of the triangle. What is the probability that none of the ants collide?
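Each ant has two possible ways to go: the edge on its left (L) or the edge on its right (R). The only way no ants collide is if they all walk in the same direction around the triangle (assuming they all move at the same speed). With three ants and two choices each, there are 2 × 2 × 2 = 8 equally likely combinations: LLL, LLR, LRL, LRR, RLL, RLR, RRL, and RRR. Only LLL and RRR avoid a collision, so the probability that none of the ants collide is 2/8 = 1/4.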
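The possible movements can be enumerated by brute force. This sketch assumes, as the answer does, that each ant picks L or R independently with equal probability and that the ants avoid a collision only when all three make the same choice.

```python
from fractions import Fraction
from itertools import product

# Every combination of directions for the three ants: LLL, LLR, ..., RRR.
outcomes = list(product("LR", repeat=3))

# No collision only when all three ants choose the same direction.
no_collision = [o for o in outcomes if len(set(o)) == 1]

probability = Fraction(len(no_collision), len(outcomes))  # 2/8 = 1/4
```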
Q11 - In probability, what's the difference between Disjoint Events and Independent Events?
A Combination is the choice of r elements from a set of n elements without replacement and where order does not matter.
A Permutation is the choice of r elements from a set of n elements without replacement and where the order matters.
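The two definitions can be compared by counting. The sketch below chooses r = 2 elements from a made-up set of n = 5 and uses `math.comb`, `math.perm`, and `itertools` from the standard library.

```python
from itertools import combinations, permutations
from math import comb, perm

items = ["a", "b", "c", "d", "e"]  # hypothetical set with n = 5
r = 2

# Order does not matter: ("a", "b") and ("b", "a") count once.
n_combinations = comb(len(items), r)  # 10
# Order matters: ("a", "b") and ("b", "a") count separately.
n_permutations = perm(len(items), r)  # 20

# The same counts obtained by listing every selection explicitly.
listed_combinations = list(combinations(items, r))
listed_permutations = list(permutations(items, r))
```

Each combination of r elements can be ordered in r! ways, which is why there are r! times as many permutations (here 2! × 10 = 20).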
Descriptive statistics summarize and organize data using means, medians, standard deviations, and graphs. Inferential statistics make predictions or inferences about a population based on a sample of data through hypothesis testing, confidence intervals, and regression analysis.
Q16 - What are Type I and Type II errors?
A Type I error occurs when you reject a true null hypothesis (false positive). A Type II error happens when you fail to reject a false null hypothesis (false negative). Balancing the risk of these errors is crucial in hypothesis testing.
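To make the trade-off concrete, the sketch below computes both error rates for a two-sided z-test of the null hypothesis that a mean is 0. The significance level, true mean, standard deviation, and sample size are all assumed values picked for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()

# Assumed setup: two-sided z-test of H0: mean = 0 with known sigma.
ALPHA = 0.05      # chosen significance level
TRUE_MEAN = 0.5   # hypothetical true mean when H0 is false
SIGMA = 1.0       # assumed known population standard deviation
N = 25            # assumed sample size

critical = standard_normal.inv_cdf(1 - ALPHA / 2)  # about 1.96

# Type I error rate: probability of rejecting H0 when it is true.
type_1_rate = 2 * (1 - standard_normal.cdf(critical))  # equals ALPHA

# Type II error rate: probability of failing to reject H0 when the true mean
# is TRUE_MEAN. The test statistic is then centred on `shift` instead of 0.
shift = TRUE_MEAN * N**0.5 / SIGMA
type_2_rate = standard_normal.cdf(critical - shift) - standard_normal.cdf(-critical - shift)

power = 1 - type_2_rate  # probability of correctly rejecting H0
```

Lowering `ALPHA` widens the non-rejection region, which reduces Type I errors but raises the Type II error rate for the same sample size.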
Q17 - What is the Central Limit Theorem and why is it important?
The Central Limit Theorem states that the sampling distribution of the sample mean approaches a normal distribution as the sample size grows, regardless of the population's distribution, provided the observations are independent and identically distributed with finite variance. It is important because it justifies the use of the normal distribution in inferential statistics.
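A small simulation can illustrate the theorem. The sketch below draws repeated samples from an exponential distribution, which is strongly skewed, and looks at the distribution of the sample means. The sample size, number of samples, and seed are arbitrary choices for this example.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

SAMPLE_SIZE = 50    # observations per sample
NUM_SAMPLES = 2000  # number of repeated samples

# Population: exponential with rate 1, so mean 1 and standard deviation 1.
sample_means = [
    statistics.fmean([random.expovariate(1.0) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)  # close to 1
sd_of_means = statistics.stdev(sample_means)    # close to 1 / sqrt(50)
theoretical_sd = 1 / SAMPLE_SIZE**0.5

# For a normal distribution, about 68% of values lie within one standard
# deviation of the mean; the sample means should behave roughly the same way.
within_one_sd = sum(
    abs(m - mean_of_means) <= sd_of_means for m in sample_means
) / NUM_SAMPLES
```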
Q18 - How do you handle missing data in a dataset?
Missing data can be handled by:
Q19 - What is multicollinearity and how can you detect and address it?
Multicollinearity occurs when independent variables in a regression model are highly correlated with one another. It can be detected using the Variance Inflation Factor (VIF) or a correlation matrix. It can be addressed by:
Q20 - Explain the difference between linear regression and logistic regression.
Linear regression is used for predicting continuous outcomes and models the relationship between the dependent and independent variables with a linear equation. Logistic regression is used for binary classification problems and models the probability of a binary outcome using a logistic function.
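The difference in the two model forms can be sketched with a single predictor. The coefficients below are hypothetical, not fitted to any data; the point is only that the linear model returns an unbounded value while the logistic model passes the same linear combination through the logistic function to get a probability.

```python
import math


def linear_prediction(x, intercept, slope):
    """Linear regression: the prediction is a continuous, unbounded value."""
    return intercept + slope * x


def logistic_prediction(x, intercept, slope):
    """Logistic regression: the prediction is a probability between 0 and 1."""
    return 1 / (1 + math.exp(-(intercept + slope * x)))


# Hypothetical coefficients, chosen only for illustration.
continuous_estimate = linear_prediction(10, intercept=1.0, slope=2.0)   # 21.0
class_probability = logistic_prediction(10, intercept=-4.0, slope=0.5)  # about 0.73
predicted_class = 1 if class_probability >= 0.5 else 0
```

The 0.5 cutoff used for `predicted_class` is a common default, not a fixed part of logistic regression; the threshold can be tuned to the problem.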
Q21 - What is overfitting and how can you prevent it?
Overfitting occurs when a statistical model describes random error or noise instead of the underlying relationship, resulting in poor performance on new data. It can be prevented by:
Q22 - What is a confidence interval and how is it interpreted?
A confidence interval is a range of values, derived from sample statistics, that is likely to contain the value of an unknown population parameter. It is commonly phrased as: "We are X% confident that the true parameter lies within this interval." Strictly, the confidence level describes the procedure rather than any single interval: if the sampling were repeated many times, about 95% of the 95% confidence intervals constructed would contain the true parameter.
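One way to see what the confidence level refers to is to repeat the sampling many times and count how often the interval captures the true value. The sketch below does this for the mean of a normal population; the true mean, standard deviation, sample size, and number of repetitions are assumed values, and the population standard deviation is treated as known to keep the interval formula simple.

```python
import random
from statistics import NormalDist, fmean

random.seed(1)  # fixed seed so the sketch is reproducible

# Assumed setup for illustration.
TRUE_MEAN = 10.0
SIGMA = 2.0      # population standard deviation, treated as known
N = 30           # observations per sample
REPEATS = 1000   # number of repeated samples

z = NormalDist().inv_cdf(0.975)    # about 1.96 for a 95% interval
half_width = z * SIGMA / N**0.5


def interval_from_new_sample():
    """Draw one sample and return its 95% confidence interval for the mean."""
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    centre = fmean(sample)
    return centre - half_width, centre + half_width


intervals = [interval_from_new_sample() for _ in range(REPEATS)]

# Fraction of intervals that contain the true mean: close to 0.95.
coverage = sum(low <= TRUE_MEAN <= high for low, high in intervals) / REPEATS
```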