
Free Microsoft DP-203 Exam Dumps Questions

Microsoft DP-203: Data Engineering on Microsoft Azure

- Get instant access to DP-203 practice exam questions

- Get ready to pass the Data Engineering on Microsoft Azure exam right now using our Microsoft DP-203 exam package, which includes a Microsoft DP-203 practice test plus a Microsoft DP-203 Exam Simulator.

- The best online DP-203 exam study material and preparation tool is here.

4.5 (870 ratings)

Question 1

- (Exam Topic 3)
What should you recommend to prevent users outside the Litware on-premises network from accessing the analytical data store?

Correct Answer: A
Virtual network rules are one firewall security feature that controls whether the database server for your single databases and elastic pools in Azure SQL Database, or for your databases in SQL Data Warehouse, accepts communications sent from particular subnets in virtual networks.
Server-level, not database-level: each virtual network rule applies to your whole Azure SQL Database server, not just to one particular database on the server. In other words, a virtual network rule applies at the server level, not at the database level.
References:
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-vnet-service-endpoint-rule-overview
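For illustration only (not part of the exam answer), here is a minimal Python sketch of creating a server-level virtual network rule with the azure-mgmt-sql SDK. All resource names and IDs are placeholders, and the method name begin_create_or_update assumes a current (track-2) SDK; verify against your installed version.

```python
# Hedged sketch: create a virtual network rule on a logical SQL server.
# Because the rule is defined on the server, it applies to every database
# hosted there (server level, not database level). Names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.sql import SqlManagementClient
from azure.mgmt.sql.models import VirtualNetworkRule

subscription_id = "<subscription-id>"
client = SqlManagementClient(DefaultAzureCredential(), subscription_id)

subnet_id = (
    f"/subscriptions/{subscription_id}/resourceGroups/litware-rg"
    "/providers/Microsoft.Network/virtualNetworks/litware-vnet"
    "/subnets/analytics-subnet"
)

poller = client.virtual_network_rules.begin_create_or_update(
    resource_group_name="litware-rg",
    server_name="litware-sql",
    virtual_network_rule_name="allow-analytics-subnet",
    parameters=VirtualNetworkRule(virtual_network_subnet_id=subnet_id),
)
rule = poller.result()
print(rule.state)  # e.g. "Ready" once the rule is provisioned
```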

Question 2

- (Exam Topic 3)
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this section, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You plan to create an Azure Databricks workspace that has a tiered structure. The workspace will contain the following three workloads:
- A workload for data engineers who will use Python and SQL.
- A workload for jobs that will run notebooks that use Python, Scala, and SQL.
- A workload that data scientists will use to perform ad hoc analysis in Scala and R.
The enterprise architecture team at your company identifies the following standards for Databricks environments:
- The data engineers must share a cluster.
- The job cluster will be managed by using a request process whereby data scientists and data engineers provide packaged notebooks for deployment to the cluster.
- All the data scientists must be assigned their own cluster that terminates automatically after 120 minutes of inactivity. Currently, there are three data scientists.
You need to create the Databricks clusters for the workloads.
Solution: You create a Standard cluster for each data scientist, a High Concurrency cluster for the data engineers, and a Standard cluster for the jobs.
Does this meet the goal?

Correct Answer: B
We would need a High Concurrency cluster for the jobs, not a Standard cluster.
Note:
Standard clusters are recommended for a single user. Standard clusters can run workloads developed in any language: Python, R, Scala, and SQL.
A High Concurrency cluster is a managed cloud resource. The key benefits of High Concurrency clusters are that they provide Apache Spark-native fine-grained sharing for maximum resource utilization and minimum query latencies.
Reference: https://docs.azuredatabricks.net/clusters/configure.html
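As an illustration of the scenario's requirements (not part of the official answer), the sketch below creates one single-user (Standard) cluster per data scientist with 120-minute auto-termination, using the Databricks Clusters REST API. The workspace URL, token, runtime version, and VM size are placeholder values to replace with ones valid in your workspace.

```python
# Hedged sketch: one Standard (single-user) cluster per data scientist,
# auto-terminating after 120 minutes of inactivity.
import requests

WORKSPACE = "https://<databricks-instance>"   # e.g. an Azure Databricks workspace URL
TOKEN = "<personal-access-token>"

def create_standard_cluster(name: str) -> str:
    """Create a single-user cluster that auto-terminates after 120 minutes."""
    spec = {
        "cluster_name": name,
        "spark_version": "7.3.x-scala2.12",   # placeholder runtime version
        "node_type_id": "Standard_DS3_v2",    # placeholder VM size
        "num_workers": 2,
        "autotermination_minutes": 120,       # required by the scenario
    }
    resp = requests.post(
        f"{WORKSPACE}/api/2.0/clusters/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=spec,
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["cluster_id"]

# One cluster per data scientist (three in the scenario).
for scientist in ("ds1", "ds2", "ds3"):
    print(create_standard_cluster(f"adhoc-{scientist}"))
```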

Question 3

- (Exam Topic 3)
You have a self-hosted integration runtime in Azure Data Factory.
The current status of the integration runtime has the following configurations:
- Status: Running
- Type: Self-Hosted
- Version: 4.4.7292.1
- Running / Registered Node(s): 1/1
- High Availability Enabled: False
- Linked Count: 0
- Queue Length: 0
- Average Queue Duration: 0.00s
The integration runtime has the following node details:
- Name: X-M
- Status: Running
- Version: 4.4.7292.1
- Available Memory: 7697MB
- CPU Utilization: 6%
- Network (In/Out): 1.21KBps/0.83KBps
- Concurrent Jobs (Running/Limit): 2/14
- Role: Dispatcher/Worker
- Credential Status: In Sync
Use the drop-down menus to select the answer choice that completes each statement based on the information presented.
NOTE: Each correct selection is worth one point.
Solution:
Box 1: fail until the node comes back online
We see: High Availability Enabled: False
Note: Higher availability of the self-hosted integration runtime ensures that it is no longer the single point of failure in your big data solution or cloud data integration with Data Factory.
Box 2: lowered
We see: Concurrent Jobs (Running/Limit): 2/14 and CPU Utilization: 6%
Note: When the processor and available RAM aren't well utilized, but the execution of concurrent jobs reaches a node's limits, scale up by increasing the number of concurrent jobs that a node can run.
Reference:
https://docs.microsoft.com/en-us/azure/data-factory/create-self-hosted-integration-runtime

Does this meet the goal?

Correct Answer: A
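For readers who want to see where a node's concurrent-jobs limit is actually adjusted, here is a hedged Python sketch using the azure-mgmt-datafactory SDK. The operation and model names are assumptions based on the public "Integration Runtime Nodes - Update" REST API, and all resource names are placeholders; verify against your SDK version before relying on it.

```python
# Hedged sketch: adjust a self-hosted integration runtime node's
# concurrent-jobs limit. Operation/model names are assumptions.
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import UpdateIntegrationRuntimeNodeRequest

client = DataFactoryManagementClient(DefaultAzureCredential(), "<subscription-id>")

node = client.integration_runtime_nodes.update(
    resource_group_name="<resource-group>",
    factory_name="<data-factory>",
    integration_runtime_name="<self-hosted-ir>",
    node_name="X-M",  # node name from the exhibit
    update_integration_runtime_node_request=UpdateIntegrationRuntimeNodeRequest(
        concurrent_jobs_limit=10  # placeholder: raise or lower based on node utilization
    ),
)
print(node.concurrent_jobs_limit)
```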

Question 4

- (Exam Topic 2)
What should you do to improve high availability of the real-time data processing solution?

Correct Answer: A
Guarantee Stream Analytics job reliability during service updates
Part of being a fully managed service is the capability to introduce new service functionality and improvements at a rapid pace. As a result, Stream Analytics can have a service update deployed on a weekly (or more frequent) basis. No matter how much testing is done, there is still a risk that an existing, running job may break due to the introduction of a bug. If you are running mission-critical jobs, these risks need to be avoided. You can reduce this risk by following Azure's paired region model.
Scenario: The application development team will create an Azure event hub to receive real-time sales data, including store number, date, time, product ID, customer loyalty number, price, and discount amount, from the point of sale (POS) system and output the data to data storage in Azure
Reference:
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-job-reliability

Question 5

- (Exam Topic 3)
You have an Azure Synapse Analytics job that uses Scala. You need to view the status of the job.
What should you do?

Correct Answer: C

Question 6

- (Exam Topic 3)
You need to implement an Azure Databricks cluster that automatically connects to Azure Data Lake Storage Gen2 by using Azure Active Directory (Azure AD) integration.
How should you configure the new cluster? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point.
Solution:
Box 1: High Concurrency
Enable Azure Data Lake Storage credential passthrough for a High Concurrency cluster.
Incorrect answers:
Support for Azure Data Lake Storage credential passthrough on Standard clusters is in Public Preview. Standard clusters with credential passthrough are supported on Databricks Runtime 5.5 and above and are limited to a single user.
Box 2: Azure Data Lake Storage Gen1 Credential Passthrough
You can authenticate automatically to Azure Data Lake Storage Gen1 and Azure Data Lake Storage Gen2 from Azure Databricks clusters using the same Azure Active Directory (Azure AD) identity that you use to log into Azure Databricks. When you enable your cluster for Azure Data Lake Storage credential passthrough, commands that you run on that cluster can read and write data in Azure Data Lake Storage without requiring you to configure service principal credentials for access to storage.
References:
https://docs.azuredatabricks.net/spark/latest/data-sources/azure/adls-passthrough.html

Does this meet the goal?

Correct Answer: A
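To make the selections concrete, below is a hypothetical cluster specification for a High Concurrency cluster with Azure AD credential passthrough enabled. The Spark configuration keys shown reflect how these options have historically been exposed and should be verified against current Azure Databricks documentation; the spec could be submitted to the same /api/2.0/clusters/create endpoint used in the earlier sketch.

```python
# Hedged sketch: High Concurrency cluster with ADLS credential passthrough.
# Spark conf keys/values are assumptions to verify; other names are placeholders.
passthrough_cluster_spec = {
    "cluster_name": "adls-passthrough-hc",
    "spark_version": "7.3.x-scala2.12",        # placeholder runtime version
    "node_type_id": "Standard_DS3_v2",         # placeholder VM size
    "num_workers": 2,
    "spark_conf": {
        "spark.databricks.cluster.profile": "serverless",        # High Concurrency mode
        "spark.databricks.passthrough.enabled": "true",          # Azure AD credential passthrough
        "spark.databricks.repl.allowedLanguages": "python,sql",  # languages allowed with passthrough
    },
}
```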
