Question 25

- (Topic 2)
A brainstorming session is conducted to identify the research questions to be explored within an analytics project. During the brainstorming activity, which of the following should happen?

Correct Answer:D
According to the Guide to Business Data Analytics, brainstorming is a technique used to generate a large number of ideas or questions in a short period of time [1]. The purpose of brainstorming is to encourage creativity and divergent thinking, not to evaluate or judge the ideas or questions. Therefore, participants should avoid critiquing suggested questions raised by the group, as this could inhibit the flow of ideas and discourage participation. The other options are not consistent with the principles of brainstorming, as they could limit the quantity or quality of the questions generated. References: 1: Guide to Business Data Analytics, IIBA, 2020, p. 32.

Question 26

- (Topic 1)
Insights based on the data collected indicate that a multi-national company could increase its sales of a mature product by reducing its price by 20%, which would result in increased revenues of 2% over a 6-month period. The team recommends this as an appropriate goal for its organization. This is considered a good goal because:

Correct Answer:A
A well-defined objective is one that is specific, measurable, achievable, relevant, and time-bound (SMART) [1]. The goal of increasing sales of a mature product by reducing its price by 20%, which would result in increased revenues of 2% over a 6-month period, meets all these criteria: it clearly states what the desired outcome is, how it will be measured, whether it is realistic and attainable, how it aligns with the organization's strategy, and when it will be achieved [2]. References: 1: Guide to Business Data Analytics, IIBA, 2020, p. 19. 2: SMART Goals: How to Make Your Goals Achievable, MindTools, 2021.
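The "measurable" part of this goal can be checked with simple arithmetic. The sketch below is illustrative only and assumes revenue equals price times units sold, which the question does not state; the function name is hypothetical.

```python
def required_volume_change(price_change: float, revenue_change: float) -> float:
    """Fractional change in units sold needed to hit a revenue target.

    Assumes revenue = price * units (an assumption, not from the question).
    Changes are fractions, e.g. -0.20 for a 20% price cut.
    """
    return (1 + revenue_change) / (1 + price_change) - 1


# A 20% price cut with a 2% revenue increase implies this uplift in units sold.
uplift = required_volume_change(price_change=-0.20, revenue_change=0.02)
print(f"Units sold must rise by about {uplift:.1%}")
```

Under that assumption, unit sales would need to rise by roughly 27.5% over the 6-month period, which gives the team a concrete figure to track.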

Question 27

- (Topic 1)
An analytics team is interested in reviewing the results of a public opinion poll that is going to be conducted at the end of the month. One factor the team is interested in is ensuring the result set is statistically significant. Why would this factor be important to the team?

Correct Answer:D
Ensuring the result set is statistically significant is important to the team because it means that the difference or relationship observed in the data is unlikely to be due to chance or sampling error. Statistical significance helps the team to assess the validity and reliability of their findings, and to draw meaningful conclusions and recommendations from the data.
Statistical significance also helps the team communicate their results with confidence and credibility to stakeholders and decision makers [1][2]. References: 1: An Easy Introduction to Statistical Significance (With Examples) - Scribbr. 2: Statistical Significance in Experimentation and Data Analysis - All About Circuits.
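As a rough illustration of what "unlikely to be due to chance" means for a poll, the sketch below runs a one-sample z-test on a proportion. The sample figures, the 50% null value, and the 0.05 threshold are assumptions chosen for the example, not values from the question.

```python
import math


def proportion_z_test(successes: int, n: int, p_null: float = 0.5):
    """Two-sided z-test of a sample proportion against a hypothesised value."""
    p_hat = successes / n
    # Standard error of the proportion under the null hypothesis.
    se = math.sqrt(p_null * (1 - p_null) / n)
    z = (p_hat - p_null) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical poll: 540 of 1000 respondents favour an option.
z, p_value = proportion_z_test(successes=540, n=1000)
print(f"z = {z:.2f}, p = {p_value:.4f}, significant at 0.05: {p_value < 0.05}")
```

With the same sample size, a result of 510 out of 1000 would not clear the 0.05 threshold, which is why the team cares about significance before drawing conclusions.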

Question 28

- (Topic 2)
An analyst is performing regression analysis and reviewing the results. They would like to rescale the variables in the model to more clearly reflect the relationship between the regression coefficients. Which technique could be used to rescale the variables?

Correct Answer:C
Normalization is a technique that rescales the values of the variables in a data set to a common scale, such as the range [0,1] or [-1,1]. Normalization puts variables measured in different units on a comparable footing, can improve the performance of some algorithms, and makes the interpretation of the regression coefficients easier and more consistent. Normalization can be done using different methods, such as min-max scaling, z-score scaling, or unit vector scaling. References: Guide to Business Data Analytics, page 41; Introduction to Business Data Analytics: A Practitioner View, page 12.
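Two of the methods named above can be sketched in a few lines. This is a minimal illustration using only the Python standard library; the function names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("min-max scaling is undefined for constant data")
    return [(v - lo) / (hi - lo) for v in values]


def z_score_scale(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("z-score scaling is undefined for constant data")
    return [(v - mu) / sigma for v in values]


incomes = [10, 20, 30]  # hypothetical predictor values
print(min_max_scale(incomes))
print(z_score_scale(incomes))
```

Once every predictor is on the same scale, the size of each regression coefficient can be compared more directly with the others.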

Question 29

- (Topic 2)
A data scientist at a consumer goods company has been asked to do a detailed analysis on customer profiles. The data scientist has identified an external data source that carries valuable additional information on their customers. The data scientist also identifies the address column as the most reliable column to join the internal data source with the external data source. Addresses may appear in different formats, for example:
File A = "13 Smith St"
File B = "Unit 7, 13 Smith Street"
Which of the following techniques would be useful in this situation?

Correct Answer:B
Probabilistic linkage is a technique that uses statistical methods to match records from different data sources based on the similarity of key variables, such as name, address, or date of birth [1]. Probabilistic linkage can handle variations, errors, or missing values in the data, and assigns a score or probability to each potential match [2]. It would be useful in this situation because the address column may have different formats, spellings, or abbreviations in the internal and external data sources, and a deterministic linkage (which requires exact matches) might miss some valid matches or create false matches.
Deterministic linkage is a technique that uses predefined rules or criteria to match records from different data sources based on the exact agreement of key variables, such as identifiers, codes, or hashes [3]. Deterministic linkage would not be useful in this situation because the address column may not have consistent or unique values across the internal and external data sources, so exact-agreement rules would fail on records such as the two address formats shown in the question.
Genetic linkage is a term used in genetics to describe the tendency of genes or DNA sequences that are located close together on a chromosome to be inherited together [4]. Genetic linkage is not relevant to this situation, as it has nothing to do with matching records from different data sources based on the address column.
Cuff linkage is a term used in sewing to describe the process of attaching a cuff to a sleeve by stitching or fastening. Cuff linkage is not relevant to this situation, as it has nothing to do with matching records from different data sources based on the address column. References: 1: Guide to Business Data Analytics, IIBA, 2020, p. 45. 2: Data Linkage: The Definitive Guide, Tableau. 3: Guide to Business Data Analytics, IIBA, 2020, p. 45. 4: Genetic Linkage, National Human Genome Research Institute. Cuff Linkage, Sewing Dictionary.
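The idea of scoring candidate matches instead of demanding exact equality can be sketched with Python's standard `difflib` module. This is a simplified similarity score, not a full probabilistic linkage model; the abbreviation table, the helper names, and the 0.75 threshold are assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical abbreviation table; a real project would need a fuller list.
ABBREVIATIONS = {"st": "street", "rd": "road", "ave": "avenue"}


def normalise(address: str) -> str:
    """Lowercase, drop punctuation, and expand common street abbreviations."""
    tokens = re.findall(r"[a-z0-9]+", address.lower())
    return " ".join(ABBREVIATIONS.get(token, token) for token in tokens)


def match_score(a: str, b: str) -> float:
    """Similarity between two addresses, from 0.0 (no overlap) to 1.0 (identical)."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


THRESHOLD = 0.75  # assumed cut-off; would be tuned on labelled pairs in practice

file_a = "13 Smith St"
for candidate in ["Unit 7, 13 Smith Street", "98 Jones Road"]:
    score = match_score(file_a, candidate)
    print(f"{candidate!r}: score={score:.2f}, match={score >= THRESHOLD}")
```

An exact comparison of the two example addresses from the question would fail, while the similarity score still ranks them as a likely match.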

Question 30

- (Topic 2)
A data dictionary is being developed for a dataset describing a company's customer base. Within the data dictionary, which of the following represents a composite data element?

Correct Answer:A
A composite data element is a data element that is made up of smaller units called sub-elements, which are separated by a sub-element separator character, such as a colon (:). For example, ITEMNO is a composite data element that consists of three sub-elements: part number, aisle number, and bin number. A street address is also a composite data element that can consist of sub-elements such as street number, street name, city, state, and zip code. First name, total sale, and birthdate are simple data elements that do not have sub-elements.
References: Data Elements - IBM; UN/EDIFACT Syntax Rules
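The ITEMNO example can be sketched as code: a composite value is split on its separator into named sub-elements. The type name, field names, and sample value below are hypothetical and only illustrate the structure.

```python
from typing import NamedTuple


class ItemNo(NamedTuple):
    """Composite data element made of three sub-elements."""
    part_number: str
    aisle_number: str
    bin_number: str


def parse_itemno(raw: str, separator: str = ":") -> ItemNo:
    """Split a composite ITEMNO value into its sub-elements."""
    parts = raw.split(separator)
    if len(parts) != 3:
        raise ValueError(f"expected 3 sub-elements, got {len(parts)}: {raw!r}")
    return ItemNo(*parts)


item = parse_itemno("P100:04:17")  # hypothetical sample value
print(item.part_number, item.aisle_number, item.bin_number)
```

A simple element such as a first name has no separator and no sub-elements, so there is nothing to split.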
