Question 73

- (Exam Topic 6)
You are updating the code for a subscriber to a Pub/Sub feed. You are concerned that upon deployment the subscriber may erroneously acknowledge messages, leading to message loss. Your subscriber is not set up to retain acknowledged messages. What should you do to ensure that you can recover from errors after deployment?

Correct Answer:B

Question 74

- (Exam Topic 3)
You need to compose a visualization for operations teams with the following requirements:
- Telemetry must include data from all 50,000 installations for the most recent 6 weeks (sampling once every minute).
- The report must not be more than 3 hours delayed from live data.
- The actionable report should only show suboptimal links.
- Most suboptimal links should be sorted to the top.
- Suboptimal links can be grouped and filtered by regional geography.
- User response time to load the report must be <5

You create a data source to store the last 6 weeks of data, and create visualizations that allow viewers to see multiple date ranges, distinct geographic regions, and unique installation types. You always show the latest data without any changes to your visualizations. You want to avoid creating and updating new visualizations each month. What should you do?

Correct Answer:B

Question 75

- (Exam Topic 5)
How can you get a neural network to learn about relationships between categories in a categorical feature?

Correct Answer:D
There are two problems with one-hot encoding. First, it has high dimensionality, meaning that instead of having just one value, like a continuous feature, it has many values, or dimensions. This makes computation more time-consuming, especially if a feature has a very large number of categories. The second problem is that it doesn’t encode any relationships between the categories. They are completely independent from each other, so the network has no way of knowing which ones are similar to each other.
Both of these problems can be solved by representing a categorical feature with an embedding column. The idea is that each category has a smaller vector with, let’s say, 5 values in it. But unlike a one-hot vector, the values are not usually 0. The values are weights, similar to the weights that are used for basic features in a neural network. The difference is that each category has a set of weights (5 of them in this case).
You can think of each value in the embedding vector as a feature of the category. So, if two categories are very similar to each other, then their embedding vectors should be very similar too.
Reference:
https://cloudacademy.com/google/introduction-to-google-cloud-machine-learning-engine-course/a-wide-and-dee
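
The contrast between one-hot vectors and embedding vectors described in the explanation can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the category names and the embedding weights are hypothetical and hand-picked, whereas in a real network the weights would be learned during training. The embedding width is set to 2 here (not the 5 mentioned above) to keep the example short.

```python
import math

# Hypothetical categories, chosen only for illustration.
CATEGORIES = ["cat", "dog", "horse", "car", "truck", "bus"]


def one_hot(category):
    """Return a vector with one slot per category; only the matching slot is 1."""
    return [1.0 if c == category else 0.0 for c in CATEGORIES]


# Hand-picked embedding weights (an assumption; a real model learns these).
# Each category maps to a short dense vector instead of a long sparse one.
EMBEDDINGS = {
    "cat": [0.9, 0.1],
    "dog": [0.8, 0.2],
    "horse": [0.7, 0.3],
    "car": [0.1, 0.9],
    "truck": [0.2, 0.8],
    "bus": [0.15, 0.85],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # One-hot vectors of different categories share no non-zero slot,
    # so their similarity carries no information about relatedness.
    print("one-hot cat vs dog:", cosine(one_hot("cat"), one_hot("dog")))
    # Embedding vectors can place related categories close together.
    print("embedding cat vs dog:", cosine(EMBEDDINGS["cat"], EMBEDDINGS["dog"]))
    print("embedding cat vs truck:", cosine(EMBEDDINGS["cat"], EMBEDDINGS["truck"]))
```

The one-hot vector grows with the number of categories (6 values here), while the embedding stays at a fixed, smaller width. Similar categories end up with a higher cosine similarity only in the embedding representation.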

Question 76

- (Exam Topic 1)
Your company’s on-premises Apache Hadoop servers are approaching end-of-life, and IT has decided to migrate the cluster to Google Cloud Dataproc. A like-for-like migration of the cluster would require 50 TB of Google Persistent Disk per node. The CIO is concerned about the cost of using that much block storage. You want to minimize the storage cost of the migration. What should you do?

Correct Answer:B

Question 77

- (Exam Topic 6)
A shipping company has live package-tracking data that is sent to an Apache Kafka stream in real time. This is then loaded into BigQuery. Analysts in your company want to query the tracking data in BigQuery to analyze geospatial trends in the lifecycle of a package. The table was originally created with ingest-date partitioning. Over time, the query processing time has increased. You need to implement a change that would improve query performance in BigQuery. What should you do?

Correct Answer:A

Question 78

- (Exam Topic 6)
A TensorFlow machine learning model on Compute Engine virtual machines (n2-standard-32) takes two days to complete training. The model has custom TensorFlow operations that must run partially on a CPU. You want to reduce the training time in a cost-effective manner. What should you do?

Correct Answer:C
