AWS-Certified-Data-Analytics-Specialty Amazon Exam Questions and Free Practice Test

Question 32

A company is reading data from various customer databases that run on Amazon RDS. The databases contain many inconsistent fields. For example, a customer record field that is place_id in one database is location_id in another database. The company wants to link customer records across different databases, even when many customer record fields do not match exactly.
Which solution will meet these requirements with the LEAST operational overhead?

A. Create an Amazon EMR cluster to process and analyze data in the databases. Connect to the Apache Zeppelin notebook, and use the FindMatches transform to find duplicate records in the data.

B. Create an AWS Glue crawler to crawl the databases. Use the FindMatches transform to find duplicate records in the data. Evaluate and tune the transform by evaluating performance and results of finding matches.

C. Create an AWS Glue crawler to crawl the data in the databases. Use Amazon SageMaker to construct Apache Spark ML pipelines to find duplicate records in the data.

D. Create an Amazon EMR cluster to process and analyze data in the databases. Connect to the Apache Zeppelin notebook, and use Apache Spark ML to find duplicate records in the data. Evaluate and tune the model by evaluating performance and results of finding duplicates.

Correct Answer:B
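
To make the Glue-based approach more concrete, here is a minimal sketch of how a FindMatches transform could be defined against a table that a crawler has already cataloged. It only builds the keyword arguments for Glue's CreateMLTransform API; the transform name, database, table, key column, and role ARN are hypothetical placeholders, not values from the question.

```python
from typing import Any, Dict


def find_matches_request(
    name: str,
    database: str,
    table: str,
    primary_key: str,
    role_arn: str,
    precision_recall_tradeoff: float = 0.5,
) -> Dict[str, Any]:
    """Build keyword arguments for Glue's CreateMLTransform API (FIND_MATCHES)."""
    if not 0.0 <= precision_recall_tradeoff <= 1.0:
        raise ValueError("precision_recall_tradeoff must be between 0.0 and 1.0")
    return {
        "Name": name,
        "Role": role_arn,
        # The table is assumed to exist in the Glue Data Catalog (e.g. from a crawler run).
        "InputRecordTables": [{"DatabaseName": database, "TableName": table}],
        "Parameters": {
            "TransformType": "FIND_MATCHES",
            "FindMatchesParameters": {
                "PrimaryKeyColumnName": primary_key,
                # Values nearer 1.0 favor precision; values nearer 0.0 favor recall.
                "PrecisionRecallTradeoff": precision_recall_tradeoff,
            },
        },
    }


# All identifiers below are hypothetical examples.
request = find_matches_request(
    name="link-customer-records",
    database="customer_catalog",
    table="customers",
    primary_key="customer_id",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
    precision_recall_tradeoff=0.8,
)

# With AWS credentials configured, the request could be submitted as:
#   import boto3
#   boto3.client("glue").create_ml_transform(**request)
```

Tuning the transform then amounts to supplying labeled examples and adjusting the tradeoff value, with no cluster to manage.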

Question 33

A company has a data lake on AWS that ingests sources of data from multiple business units and uses Amazon Athena for queries. The storage layer is Amazon S3 using the AWS Glue Data Catalog. The company wants to make the data available to its data scientists and business analysts. However, the company first needs to manage data access for Athena based on user roles and responsibilities.
What should the company do to apply these access controls with the LEAST operational overhead?

A. Define security policy-based rules for the users and applications by role in AWS Lake Formation.

B. Define security policy-based rules for the users and applications by role in AWS Identity and Access Management (IAM).

C. Define security policy-based rules for the tables and columns by role in AWS Glue.

D. Define security policy-based rules for the tables and columns by role in AWS Identity and Access Management (IAM).

Correct Answer:A
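
For illustration, role-based rules of the kind option A describes could be expressed through Lake Formation's GrantPermissions API. The sketch below only builds the request arguments; the role ARN, database, table, and column names are hypothetical.

```python
from typing import Any, Dict, List


def column_select_grant(
    principal_arn: str,
    database: str,
    table: str,
    columns: List[str],
) -> Dict[str, Any]:
    """Build keyword arguments for Lake Formation's GrantPermissions API.

    Grants SELECT on the listed columns only, so Athena queries run by this
    principal can read those columns and nothing else in the table.
    """
    if not columns:
        raise ValueError("at least one column is required")
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


# Hypothetical analyst role limited to two non-sensitive columns.
grant = column_select_grant(
    principal_arn="arn:aws:iam::123456789012:role/ExampleAnalystRole",
    database="sales_lake",
    table="orders",
    columns=["order_id", "order_total"],
)

# With AWS credentials configured, the grant could be applied as:
#   import boto3
#   boto3.client("lakeformation").grant_permissions(**grant)
```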

Question 35

A company hosts an Apache Flink application on premises. The application processes data from several Apache Kafka clusters. The data originates from a variety of sources, such as web applications, mobile apps, and operational databases. The company has migrated some of these sources to AWS and now wants to migrate the Flink application. The company must ensure that data that resides in databases within the VPC does not traverse the internet. The application must be able to process all the data that comes from the company's AWS solution, on-premises resources, and the public internet.
Which solution will meet these requirements with the LEAST operational overhead?

A. Implement Flink on Amazon EC2 within the company's VPC. Create Amazon Managed Streaming for Apache Kafka (Amazon MSK) clusters in the VPC to collect data that comes from applications and databases within the VPC. Use Amazon Kinesis Data Streams to collect data that comes from the public internet. Configure Flink to have sources from Kinesis Data Streams, Amazon MSK, and any on-premises Kafka clusters by using AWS Client VPN or AWS Direct Connect.

B. Implement Flink on Amazon EC2 within the company's VPC. Use Amazon Kinesis Data Streams to collect data that comes from applications and databases within the VPC and the public internet. Configure Flink to have sources from Kinesis Data Streams and any on-premises Kafka clusters by using AWS Client VPN or AWS Direct Connect.

C. Create an Amazon Kinesis Data Analytics application by uploading the compiled Flink jar file. Use Amazon Kinesis Data Streams to collect data that comes from applications and databases within the VPC and the public internet. Configure the Kinesis Data Analytics application to have sources from Kinesis Data Streams and any on-premises Kafka clusters by using AWS Client VPN or AWS Direct Connect.

D. Create an Amazon Kinesis Data Analytics application by uploading the compiled Flink jar file. Create Amazon Managed Streaming for Apache Kafka (Amazon MSK) clusters in the company's VPC to collect data that comes from applications and databases within the VPC. Use Amazon Kinesis Data Streams to collect data that comes from the public internet. Configure the Kinesis Data Analytics application to have sources from Kinesis Data Streams, Amazon MSK, and any on-premises Kafka clusters by using AWS Client VPN or AWS Direct Connect.

Correct Answer:D

Question 36

A company is sending historical datasets to Amazon S3 for storage. A data engineer at the company wants to make these datasets available for analysis using Amazon Athena. The engineer also wants to encrypt the Athena query results in an S3 results location by using AWS solutions for encryption. The requirements for encrypting the query results are as follows:
Use custom keys for encryption of the primary dataset query results.
Use generic encryption for all other query results.
Provide an audit trail for the primary dataset queries that shows when the keys were used and by whom.
Which solution meets these requirements?

A. Use server-side encryption with S3 managed encryption keys (SSE-S3) for the primary dataset. Use SSE-S3 for the other datasets.

B. Use server-side encryption with customer-provided encryption keys (SSE-C) for the primary dataset. Use server-side encryption with S3 managed encryption keys (SSE-S3) for the other datasets.

C. Use server-side encryption with AWS KMS managed customer master keys (SSE-KMS CMKs) for the primary dataset. Use server-side encryption with S3 managed encryption keys (SSE-S3) for the other datasets.

D. Use client-side encryption with AWS Key Management Service (AWS KMS) customer managed keys for the primary dataset. Use S3 client-side encryption with client-side keys for the other datasets.

Correct Answer:C
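
As a rough sketch of how the SSE-KMS and SSE-S3 choices in the options map onto Athena, the helper below builds the ResultConfiguration argument for Athena's StartQueryExecution API. When a KMS key is supplied it selects SSE_KMS (KMS key usage is logged in AWS CloudTrail, which is what provides the audit trail); otherwise it falls back to SSE_S3. The bucket paths and key ARN are hypothetical.

```python
from typing import Any, Dict, Optional


def result_configuration(
    output_location: str,
    kms_key_arn: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the ResultConfiguration for Athena's StartQueryExecution API."""
    if kms_key_arn:
        # Customer managed key: each use of the key is recorded in CloudTrail.
        encryption = {"EncryptionOption": "SSE_KMS", "KmsKey": kms_key_arn}
    else:
        # Generic encryption with S3 managed keys; no key to specify.
        encryption = {"EncryptionOption": "SSE_S3"}
    return {
        "OutputLocation": output_location,
        "EncryptionConfiguration": encryption,
    }


# Hypothetical locations and key for the primary and the other datasets.
primary_config = result_configuration(
    "s3://example-results-bucket/primary/",
    kms_key_arn="arn:aws:kms:us-east-1:123456789012:key/example-key-id",
)
other_config = result_configuration("s3://example-results-bucket/other/")

# With AWS credentials configured, a query could be started as:
#   import boto3
#   boto3.client("athena").start_query_execution(
#       QueryString="SELECT 1",
#       ResultConfiguration=primary_config,
#   )
```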
