# All Units: Data Analytics MCQs

1. The branch of statistics that deals with the development of particular statistical methods is classified as
1. industry statistics
2. economic statistics
3. applied statistics
4. mathematical statistics

2. Which of the following is true about regression analysis?
2. estimating numerical characteristics of the data
3. modeling relationships within the data
4. describing associations within the data

3. Text Analytics is also referred to as Text Mining.
1. True
2. False
3. Can be true or false
4. Cannot say

4. What is a hypothesis?
1. A statement that the researcher wants to test through the data collected in a study.
2. A research question the results will answer.
3. A theory that underpins the study.
4. A statistical method for calculating the extent to which the results could have happened by chance.

5. What is the cyclical process of collecting and analyzing data during a single research study called?
1. Interim Analysis
2. Inter analysis
3. Inter-item analysis
4. constant analysis

6. The process of quantifying data is referred to as ____
1. Topology
2. Diagramming
3. Enumeration
4. Coding

7. An advantage of using computer programs for qualitative data is that they ____
1. Can reduce the time required to analyse data (i.e., after the data are transcribed)
2. Help in storing and organising data
3. Make many procedures available that are rarely done by hand due to time constraints
4. All of the above

8. ______ are the basic building blocks of qualitative data.
1. Categories
2. Units
3. Individuals
4. None of the above

9. This is the process of transforming qualitative research data from written interviews or field notes into typed text.
1. Segmenting
2. Coding
3. Transcription
4. Mnemoning

10. A graph that uses vertical bars to represent data is called a ___
1. Line graph
2. Bar graph
3. Scatterplot
4. Vertical graph

11. ____ are used when you want to visually examine the relationship between two quantitative variables.
1. Bar graph
2. Pie graph
3. Line graph
4. Scatterplot

12. The denominator (bottom) of the z-score formula is
1. The standard deviation
2. The difference between a score and the mean
3. The range
4. The mean
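
Note: the z-score formula referred to in question 12 can be sketched in a few lines of Python. This is a minimal illustration, not part of the original question set; the function name `z_score` and the sample scores are made up, and the population standard deviation is assumed (a sample standard deviation would give a slightly different value).

```python
from statistics import mean, pstdev


def z_score(x, data):
    """Return how many standard deviations x lies from the mean of data."""
    mu = mean(data)
    sigma = pstdev(data)  # assumption: population standard deviation
    if sigma == 0:
        raise ValueError("standard deviation is zero; z-score is undefined")
    return (x - mu) / sigma


# Hypothetical scores: mean is 5, population standard deviation is 2
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_score(9, scores))  # (9 - 5) / 2 = 2.0
```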

13. Which of these distributions is used for testing a hypothesis?
1. Normal Distribution
2. Chi-Squared Distribution
3. Gamma Distribution
4. Poisson Distribution

14. A statement made about a population for testing purposes is called a
1. Statistic
2. Hypothesis
3. Level of Significance
4. Test-Statistic

15. A hypothesis that is assumed to be true and then tested for possible rejection is called a
1. Null Hypothesis
2. Statistical Hypothesis
3. Simple Hypothesis
4. Composite Hypothesis

16. If the null hypothesis is false, then which of the following is accepted?
1. Null Hypothesis
2. Positive Hypothesis
3. Negative Hypothesis
4. Alternative Hypothesis

17. The alternative hypothesis is also called the
1. Composite hypothesis
2. Research Hypothesis
3. Simple Hypothesis
4. Null Hypothesis

18. Data Analysis is a process of?
A. inspecting data
B. cleaning data
C. transforming data
D. All of the above

19. Which of the following is not a major data analysis approach?
A. Data Mining
B. Predictive Intelligence
D. Text Analytics

20. How many main statistical methodologies are used in data analysis?
A. 2
B. 3
C. 4
D. 5

21. In descriptive statistics, data from the entire population or a sample is summarized with
A. integer descriptors
B. floating descriptors
C. numerical descriptors
D. decimal descriptors

22. Data analysis was defined by which statistician?
A. William S.
B. Hans Peter Luhn
C. Gregory Piatetsky-Shapiro
D. John Tukey

23. Which of the following is true about hypothesis testing?
B. estimating numerical characteristics of the data
C. describing associations within the data
D. modeling relationships within the data

24. The goal of business intelligence is to allow easy interpretation of large volumes of data to identify new opportunities.
A. TRUE
B. FALSE
C. Can be true or false
D. Cannot say

25. The branch of statistics that deals with the development of particular statistical methods is classified as
A. industry statistics
B. economic statistics
C. applied statistics
D. mathematical statistics

26. Which of the following is true about regression analysis?
B. estimating numerical characteristics of the data
C. modeling relationships within the data
D. describing associations within the data

27. Text Analytics is also referred to as Text Mining.
A. TRUE
B. FALSE
C. Can be true or false
D. Cannot say

28. What is the minimum number of variables/features required to perform clustering?
1. 0
2. 1
3. 2
4. 3

29. For two runs of K-means clustering, is it expected to get the same clustering results?
1. Yes
2. No

30. Which of the following algorithms is most sensitive to outliers?
1. K-means clustering algorithm
2. K-medians clustering algorithm
3. K-modes clustering algorithm
4. K-medoids clustering algorithm

31. The discrete variables and continuous variables are two types of
1. Open end classification
2. Time series classification
3. Qualitative classification
4. Quantitative classification

32. A Bayesian classifier is
1. A class of learning algorithms that tries to find an optimum classification of a set of examples using probability theory.
2. Any mechanism employed by a learning system to constrain the search space of a hypothesis
3. An approach to the design of learning algorithms that is inspired by the fact that when people encounter new situations, they often explain them by reference to familiar experiences, adapting the explanations to fit the new situation.
4. None of these

33. Classification accuracy is
1. A subdivision of a set of examples into a number of classes
2. A measure of the accuracy of the classification of a concept that is given by a certain theory
3. The task of assigning a classification to a set of examples
4. None of these

34. Euclidean distance measure is
1. A stage of the KDD process in which new data is added to the existing selection.
2. The process of finding a solution for a problem simply by enumerating all possible solutions according to some predefined order and then testing them
3. The distance between two points as calculated using the Pythagorean theorem
4. None of the above
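
Note: the distance calculation described in question 34 can be sketched as follows. This is an illustrative example only; the function name `euclidean_distance` and the sample points are assumptions, not part of the original question set.

```python
import math


def euclidean_distance(p, q):
    """Straight-line distance between two points of equal dimension."""
    if len(p) != len(q):
        raise ValueError("points must have the same number of coordinates")
    # Square the difference along each axis, sum, then take the square root
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Classic 3-4-5 right triangle
print(euclidean_distance((0, 0), (3, 4)))  # 5.0
```

The same function works for any number of dimensions, since the sum simply runs over every coordinate pair.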

35. Hybrid is
1. Combining different types of method or information
2. Approach to the design of learning algorithms that is structured along the lines of the theory of evolution.
3. Decision support systems that contain an information base filled with the knowledge of an expert formulated in terms of if-then rules.
4. None of the above

36. Decision trees use ______ , in that they always choose the option that seems the best available at that moment.
1. Greedy Algorithms
2. divide and conquer
3. Backtracking
4. Shortest path algorithm

37. Discovery is
1. It is hidden within a database and can only be recovered if one is given certain clues (an example is encrypted information).
2. The process of extracting implicit, previously unknown, and potentially useful information from data
3. An extremely complex molecule that occurs in human chromosomes and that carries genetic information in the form of genes.
4. None of these

38. Hidden knowledge refers to
1. A set of databases from different vendors, possibly using different database paradigms
2. An approach to a problem that is not guaranteed to work but performs well in most cases
3. Information that is hidden in a database and that cannot be recovered by a simple SQL query.
4. None of these

39. Enrichment is
1. A stage of the KDD process in which new data is added to the existing selection
2. The process of finding a solution for a problem simply by enumerating all possible solutions according to some predefined order and then testing them
3. The distance between two points as calculated using the Pythagoras theorem.
4. None of these

40. _____ are easy to implement and can execute efficiently even without prior knowledge of the data; they are among the most popular algorithms for classifying text documents.
1. ID3
2. Naïve Bayes classifiers
3. CART
4. None of the above

41. High entropy means that the partitions in classification are
1. Pure
2. Not Pure
3. Useful
4. Useless

42. Which of the following statements about Naive Bayes is incorrect?
1. Attributes are equally important.
2. Attributes are statistically dependent on one another given the class value.
3. Attributes are statistically independent of one another given the class value.
4. Attributes can be nominal or numeric

43. The maximum value of entropy depends on the number of classes. If we have 8 classes, what is the maximum entropy?
1. Max Entropy is 1
2. Max Entropy is 2
3. Max Entropy is 3
4. Max Entropy is 4
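
Note: the entropy calculation behind question 43 can be checked numerically. The sketch below assumes Shannon entropy measured in bits (base-2 logarithm) and uses the fact that entropy is largest when all classes are equally likely; the function name `entropy` is illustrative.

```python
import math


def entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    # Terms with p == 0 contribute nothing and are skipped to avoid log2(0)
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Assumption: 8 equally likely classes, which is the maximum-entropy case
uniform_8 = [1 / 8] * 8
print(entropy(uniform_8))  # log2(8) = 3.0

# A skewed distribution over the same 8 classes has lower entropy
skewed_8 = [0.65] + [0.05] * 7
print(entropy(skewed_8) < entropy(uniform_8))  # True
```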

44. Point out the wrong statement.
1. k-nearest neighbor is same as k-means
2. k-means clustering is a method of vector quantization
3. k-means clustering aims to partition n observations into k clusters
4. none of the mentioned

45. Consider the following example: “How can we divide a set of articles such that those articles have the same theme (we do not know the theme of the articles ahead of time)?” Is this:
1. Clustering
2. Classification
3. Regression
4. None of these

46. Clustering techniques are ______ in the sense that the data scientist does not determine, in advance, the labels to apply to the clusters.
1. Unsupervised
2. supervised
3. Reinforcement
4. Neural network

47. _____ metric is examined to determine a reasonably optimal value of k.
1. Mean Square Error
2. Within Sum of Squares (WSS)
3. Speed
4. None of these
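
Note: a within-cluster sum of squares, as named in the options of question 47, can be computed as sketched below. This is a minimal illustration that assumes cluster labels and centroids are already available (for example, from a k-means run); the function name `wss` and the toy points are hypothetical.

```python
def wss(points, labels, centroids):
    """Sum of squared distances from each point to its assigned centroid."""
    total = 0.0
    for point, label in zip(points, labels):
        center = centroids[label]
        total += sum((a - b) ** 2 for a, b in zip(point, center))
    return total


# Toy data: two well-separated groups of two points each
points = [(1.0, 1.0), (1.0, 3.0), (9.0, 1.0), (9.0, 3.0)]

# k = 1: every point is assigned to the overall mean (5, 2)
print(wss(points, [0, 0, 0, 0], [(5.0, 2.0)]))  # 68.0

# k = 2: each group gets its own centroid
print(wss(points, [0, 0, 1, 1], [(1.0, 2.0), (9.0, 2.0)]))  # 4.0
```

A common heuristic is to compute this value for several values of k and look for the point where further increases in k stop reducing it substantially.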

48. If an itemset is considered frequent, then any subset of the frequent itemset must also be frequent. This is known as the
1. Apriori Property
2. Downward Closure Property
3. Either 1 or 2
4. Both 1 and 2

49. If {bread,eggs,milk} has a support of 0.15 and {bread,eggs} also has a support of 0.15, the confidence of the rule {bread,eggs}→{milk} is
1. 0
2. 1
3. 2
4. 3
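
Note: the confidence of an association rule X→Y is the support of X∪Y divided by the support of X. The sketch below applies that definition to the supports given in question 49; the function and parameter names are illustrative.

```python
def confidence(support_xy, support_x):
    """Confidence of the rule X -> Y: support(X and Y) / support(X)."""
    if support_x <= 0:
        raise ValueError("support of the antecedent must be positive")
    return support_xy / support_x


# Supports taken from question 49
support_bread_eggs_milk = 0.15
support_bread_eggs = 0.15
print(confidence(support_bread_eggs_milk, support_bread_eggs))  # 1.0
```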

50. ______ recommend items based on similarity measures between users and/or items.
1. Content Based Systems
2. Hybrid System
3. Collaborative Filtering Systems
4. None of these

51. There are ______ major classifications of collaborative filtering mechanisms.
1. 1
2. 2
3. 3
4. None of the above

52. Movie Recommendation to people is an example of
1. User Based Recommendation
2. Item Based Recommendation
3. Knowledge Based Recommendation
4. Content Based Recommendation