All Units: Data Analytics MCQs

 

1. The branch of statistics that deals with the development of particular statistical methods is classified as
1. industry statistics
2. economic statistics
3. applied statistics
4. mathematical statistics

Answer: 4

2. Which of the following is true about regression analysis?
1. answering yes/no questions about the data
2. estimating numerical characteristics of the data
3. modeling relationships within the data
4. describing associations within the data

Answer: 3
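
As a minimal sketch of what "modeling relationships within the data" can look like, the Python snippet below fits a straight line to made-up data with ordinary least squares. The function name `fit_line` and the sample values are illustrative assumptions, not part of the question.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data lying exactly on the line y = 2x + 1
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```

The fitted slope and intercept describe how the response changes with the predictor, which is the sense in which regression models a relationship.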

3. Text Analytics is also referred to as Text Mining.
1. True
2. False
3. Can be true or False
4. Can not say

Answer: 1

4. What is a hypothesis?
1. A statement that the researcher wants to test through the data collected in a study.
2. A research question the results will answer.
3. A theory that underpins the study.
4. A statistical method for calculating the extent to which the results could have happened by chance.

Answer: 1

5. What is the cyclical process of collecting and analyzing data during a single research study called?
1. Interim Analysis
2. Inter analysis
3. inter-item analysis
4. constant analysis

Answer: 1

6. The process of quantifying data is referred to as ____
1. Topology
2. Diagramming
3. Enumeration
4. coding

Answer: 3

7. An advantage of using computer programs for qualitative data is that they _
1. Can reduce time required to analyse data (i.e., after the data are transcribed)
2. Help in storing and organising data
3. Make many procedures available that are rarely done by hand due to time constraints
4. All of the above

Answer: 4

8. ______ are the basic building blocks of qualitative data.
1. Categories
2. Units
3. Individuals
4. None of the above

Answer: 1

9. This is the process of transforming qualitative research data from written interviews or field notes into typed text.
1. Segmenting
2. Coding
3. Transcription
4. Mnemoning

Answer: 3

10. A graph that uses vertical bars to represent data is called a ___
1. Line graph
2. Bar graph
3. Scatterplot
4. Vertical graph

Answer: 2

11. ____ are used when you want to visually examine the relationship between two quantitative variables.
1. Bar graph
2. pie graph
3. line graph
4. Scatterplot

Answer: 4

12. The denominator (bottom) of the z-score formula is
1. The standard deviation
2. The difference between a score and the mean
3. The range
4. The mean

Answer: 1
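
For reference, the z-score formula divides the difference between a score and the mean by the standard deviation. A small Python sketch follows, assuming the population standard deviation is intended; the helper name `z_score` and the sample scores are made up for illustration.

```python
from statistics import mean, pstdev


def z_score(x, data):
    """Return (x - mean) / standard deviation for a single score."""
    mu = mean(data)
    sigma = pstdev(data)  # population standard deviation (an assumption)
    return (x - mu) / sigma  # numerator: deviation, denominator: std dev


# Hypothetical scores with mean 5 and population standard deviation 2
scores = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_score(9, scores)
print(z)
```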

13. Which of these distributions is used for testing a hypothesis?
1. Normal Distribution
2. Chi-Squared Distribution
3. Gamma Distribution
4. Poisson Distribution

Answer: 2
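
To show how a chi-squared test statistic is formed, here is a hedged Python sketch of a goodness-of-fit calculation. The observed and expected counts are invented, the function name `chi_square_statistic` is illustrative, and the critical value 7.815 is the standard table value for 3 degrees of freedom at a 0.05 significance level.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for four categories; the null hypothesis says
# all four categories are equally likely (25 expected in each).
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]

stat = chi_square_statistic(observed, expected)

# Table value for df = 4 - 1 = 3 at alpha = 0.05
CRITICAL_VALUE = 7.815
reject_null = stat > CRITICAL_VALUE
print(stat, reject_null)
```

A statistic above the critical value leads to rejecting the null hypothesis at the chosen significance level.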

14. A statement made about a population for testing purposes is called a?
1. Statistic
2. Hypothesis
3. Level of Significance
4. Test-Statistic

Answer: 2

15. A hypothesis that is assumed to be true and then tested for possible rejection is called the?
1. Null Hypothesis
2. Statistical Hypothesis
3. Simple Hypothesis
4. Composite Hypothesis

Answer: 1

16. If the null hypothesis is false, then which of the following is accepted?
1. Null Hypothesis
2. Positive Hypothesis
3. Negative Hypothesis
4. Alternative Hypothesis.

Answer: 4

17. The alternative hypothesis is also called the?
1. Composite hypothesis
2. Research Hypothesis
3. Simple Hypothesis
4. Null Hypothesis

Answer: 2

18. Data Analysis is a process of?
A. inspecting data
B. cleaning data
C. transforming data
D. All of the above

Answer: D

19. Which of the following is not a major data analysis approach?
A. Data Mining
B. Predictive Intelligence
C. Business Intelligence
D. Text Analytics

Answer: B

20. How many main statistical methodologies are used in data analysis?
A. 2
B. 3
C. 4
D. 5

Answer: A

21. In descriptive statistics, data from the entire population or a sample is summarized with?
A. integer descriptors
B. floating descriptors
C. numerical descriptors
D. decimal descriptors

Answer: C

22. Data Analysis is defined by the statistician?
A. William S.
B. Hans Peter Luhn
C. Gregory Piatetsky-Shapiro
D. John Tukey

Answer: D

23. Which of the following is true about hypothesis testing?
A. answering yes/no questions about the data
B. estimating numerical characteristics of the data
C. describing associations within the data
D. modeling relationships within the data

Answer: A

24. The goal of business intelligence is to allow easy interpretation of large volumes of data to identify new opportunities.
A. TRUE
B. FALSE
C. Can be true or false
D. Can not say

Answer: A

25. The branch of statistics that deals with the development of particular statistical methods is classified as
A. industry statistics
B. economic statistics
C. applied statistics
D. mathematical statistics

Answer: D

26. Which of the following is true about regression analysis?
A. answering yes/no questions about the data
B. estimating numerical characteristics of the data
C. modeling relationships within the data
D. describing associations within the data

Answer: C

27. Text Analytics is also referred to as Text Mining.
A. TRUE
B. FALSE
C. Can be true or false
D. Can not say

Answer: A

28. What is the minimum number of variables/features required to perform clustering?
1. 0
2. 1
3. 2
4. 3

Answer: 2

29. For two runs of K-Means clustering, is it expected to get the same clustering results?
1. Yes
2. No

Answer: 2

30. Which of the following algorithms is most sensitive to outliers?
1. K-means clustering algorithm
2. K-medians clustering algorithm
3. K-modes clustering algorithm
4. K-medoids clustering algorithm

Answer: 1
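
One way to see why this is so: a k-means centroid is the mean of its cluster, and a mean moves much more than a median when an extreme value is present. The Python sketch below uses made-up one-dimensional data to compare the two; it illustrates the centroid update only, not a full clustering run.

```python
from statistics import mean, median

# Hypothetical one-dimensional cluster, then the same cluster with
# its largest value replaced by an extreme outlier.
cluster = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 100]

# How far the cluster "center" moves under each definition
mean_shift = mean(with_outlier) - mean(cluster)        # mean-based center
median_shift = median(with_outlier) - median(cluster)  # median-based center

print(mean_shift, median_shift)
```

The mean-based center is pulled strongly toward the outlier while the median-based center stays put.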

31. The discrete variables and continuous variables are two types of
1. Open end classification
2. Time series classification
3. Qualitative classification
4. Quantitative classification

Answer: 4

32. A Bayesian classifier is
1. A class of learning algorithm that tries to find an optimum classification of a set of examples using the probabilistic theory.
2. Any mechanism employed by a learning system to constrain the search space of a hypothesis
3. An approach to the design of learning algorithms that is inspired by the fact that when people encounter new situations, they often explain them by reference to familiar experiences, adapting the explanations to fit the new situation.
4. None of these

Answer: 1

33. Classification accuracy is
1. A subdivision of a set of examples into a number of classes
2. Measure of the accuracy of the classification of a concept that is given by a certain theory
3. The task of assigning a classification to a set of examples
4. None of these

Answer: 2

34. Euclidean distance measure is
1. A stage of the KDD process in which new data is added to the existing selection.
2. The process of finding a solution for a problem simply by enumerating all possible solutions according to some predefined order and then testing them
3. The distance between two points as calculated using the Pythagoras theorem
4. none of above

Answer: 3
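
The Pythagorean calculation can be sketched in a few lines of Python. The function name `euclidean_distance` and the sample points are illustrative assumptions.

```python
import math


def euclidean_distance(p, q):
    """Square root of the summed squared coordinate differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# A 3-4-5 right triangle: the distance is the hypotenuse
d = euclidean_distance((0, 0), (3, 4))
print(d)
```

The same function works for points with more than two coordinates, since it sums over however many dimensions the points have.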

35. Hybrid is
1. Combining different types of method or information
2. Approach to the design of learning algorithms that is structured along the lines of the theory of evolution.
3. Decision support systems that contain an information base filled with the knowledge of an expert formulated in terms of if-then rules.
4. none of above

Answer: 1

36. Decision trees use ______ , in that they always choose the option that seems the best available at that moment.
1. Greedy Algorithms
2. divide and conquer
3. Backtracking
4. Shortest path algorithm

Answer: 1

37. Discovery is
1. It is hidden within a database and can only be recovered if one is given certain clues (an example is encrypted information).
2. The process of extracting implicit, previously unknown, and potentially useful information from data
3. An extremely complex molecule that occurs in human chromosomes and that carries genetic information in the form of genes.
4. None of these

Answer: 2

38. Hidden knowledge refers to
1. A set of databases from different vendors, possibly using different database paradigms
2. An approach to a problem that is not guaranteed to work but performs well in most cases
3. Information that is hidden in a database and that cannot be recovered by a simple SQL query.
4. None of these

Answer: 3

39. Enrichment is
1. A stage of the KDD process in which new data is added to the existing selection
2. The process of finding a solution for a problem simply by enumerating all possible solutions according to some predefined order and then testing them
3. The distance between two points as calculated using the Pythagoras theorem.
4. None of these

Answer: 1

40. _____ are easy to implement and can execute efficiently even without prior knowledge of the data; they are among the most popular algorithms for classifying text documents.
1. ID3
2. Naïve Bayes classifiers
3. CART
4. None of above

Answer: 2

41. High entropy means that the partitions in classification are
1. Pure
2. Not Pure
3. Useful
4. Useless

Answer: 2

42. Which of the following statements about Naive Bayes is incorrect?
1. Attributes are equally important.
2. Attributes are statistically dependent on one another given the class value.
3. Attributes are statistically independent of one another given the class value.
4. Attributes can be nominal or numeric

Answer: 2

43. The maximum value for entropy depends on the number of classes. If we have 8 classes, what is the maximum entropy?
1. Max Entropy is 1
2. Max Entropy is 2
3. Max Entropy is 3
4. Max Entropy is 4

Answer: 3
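
The value follows from the entropy formula with base-2 logarithms: entropy is largest when all classes are equally likely, which gives log2(8) = 3 bits for 8 classes. A short Python sketch, with the helper name `entropy` chosen for illustration:

```python
import math


def entropy(probabilities):
    """Shannon entropy in bits; zero-probability classes contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


# Eight equally likely classes give the maximum possible entropy
uniform_8 = [1 / 8] * 8
max_entropy = entropy(uniform_8)

# A skewed distribution over the same 8 classes has lower entropy
skewed_8 = [0.65] + [0.05] * 7
print(max_entropy, entropy(skewed_8) < max_entropy)
```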

44. Point out the wrong statement.
1. k-nearest neighbor is same as k-means
2. k-means clustering is a method of vector quantization
3. k-means clustering aims to partition n observations into k clusters
4. none of the mentioned

Answer: 1

45. Consider the following example: “How can we divide a set of articles such that those articles have the same theme (we do not know the theme of the articles ahead of time)?” Is this:
1. Clustering
2. Classification
3. Regression
4. None of these

Answer: 1

46. Clustering techniques are ______ in the sense that the data scientist does not determine, in advance, the labels to apply to the clusters.
1. Unsupervised
2. supervised
3. Reinforcement
4. Neural network

Answer: 1

47. _____ metric is examined to determine a reasonably optimal value of k.
1. Mean Square Error
2. Within Sum of Squares (WSS)
3. Speed
4. None of these

Answer: 2
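
As a rough sketch of the metric, WSS adds up the squared distance from each point to the centroid of its assigned cluster. The Python below computes it for hand-picked assignments rather than running k-means itself; the function name `wss`, the points, and the centroids are all assumptions made for illustration.

```python
def wss(points, centroids, labels):
    """Within Sum of Squares: squared distance of each point to its centroid."""
    total = 0.0
    for point, label in zip(points, labels):
        centroid = centroids[label]
        total += sum((a - b) ** 2 for a, b in zip(point, centroid))
    return total


# Hypothetical data: two tight groups far apart on the x-axis
points = [(0, 0), (0, 2), (10, 0), (10, 2)]

# k = 1: one centroid at the overall mean
wss_k1 = wss(points, [(5, 1)], [0, 0, 0, 0])

# k = 2: one centroid per group
wss_k2 = wss(points, [(0, 1), (10, 1)], [0, 0, 1, 1])

print(wss_k1, wss_k2)
```

In practice WSS is plotted against k, and a value of k is chosen near the point where further increases stop reducing WSS by much.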

48. If an itemset is considered frequent, then any subset of the frequent itemset must also be frequent.
1. Apriori Property
2. Downward Closure Property
3. Either 1 or 2
4. Both 1 and 2

Answer: 4
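
The property can be checked directly on a small example. The Python sketch below uses invented transactions and the hypothetical helpers `support` and `subsets_at_least_as_frequent` to confirm that no subset of an itemset has lower support than the itemset itself.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def subsets_at_least_as_frequent(itemset, transactions):
    """True if every non-empty proper subset has support >= the itemset's."""
    base = support(itemset, transactions)
    for size in range(1, len(itemset)):
        for subset in combinations(sorted(itemset), size):
            if support(set(subset), transactions) < base:
                return False
    return True


# Hypothetical market-basket data
transactions = [
    {"bread", "eggs", "milk"},
    {"bread", "eggs"},
    {"bread", "milk"},
    {"eggs"},
]

holds = subsets_at_least_as_frequent({"bread", "eggs"}, transactions)
print(support({"bread", "eggs"}, transactions), holds)
```

Because any transaction containing an itemset also contains each of its subsets, the check always succeeds, which is what lets Apriori prune candidates whose subsets are infrequent.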

49. If {bread,eggs,milk} has a support of 0.15 and {bread,eggs} also has a support of 0.15, the confidence of the rule {bread,eggs}→{milk} is
1. 0
2. 1
3. 2
4. 3

Answer: 2
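
The arithmetic behind this question: the confidence of a rule X→Y is support(X ∪ Y) divided by support(X), so here it is 0.15 / 0.15, a confidence value of 1. A minimal Python sketch, with the function name `confidence` chosen for illustration:

```python
def confidence(support_xy, support_x):
    """Confidence of the rule X -> Y given support(X and Y) and support(X)."""
    return support_xy / support_x


# Values taken from the question:
# support({bread, eggs, milk}) = 0.15 and support({bread, eggs}) = 0.15
rule_confidence = confidence(0.15, 0.15)
print(rule_confidence)
```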

50. ______ recommend items based on similarity measures between users and/or items.
1. Content Based Systems
2. Hybrid System
3. Collaborative Filtering Systems
4. None of these

Answer: 3

51. There are ______ major classifications of collaborative filtering mechanisms.
1. 1
2. 2
3. 3
4. none of above

Answer: 2

52. Movie Recommendation to people is an example of
1. User Based Recommendation
2. Item Based Recommendation
3. Knowledge Based Recommendation
4. content based recommendation

Answer: 2

53. _____ recommenders rely on an explicitly defined set of recommendation rules.
1. Constraint Based
2. Case Based
3. Content Based
4. User Based

Answer: 1
