The SQL Server alert system does a great job of handling notifications for SQL Agent job failures but if want to know if a job is hung, taking longer than expected, or maybe it’s disabled and hasn’t been running, we will need to put in a bit of effort ourselves.
This first query will return the start and stop times of the last execution of a job. If the job is currently running then it will provide the duration in seconds that it has been running for. This can easily be put into a SQL Agent job to run every few minutes and send out an e-mail with jobs that have been running for longer than you’d like.
SELECT jobs.name AS [Job_Name] , CONVERT(VARCHAR(23),ja.start_execution_date,121) AS [Start_execution_date] , ISNULL(CONVERT(VARCHAR(23),ja.stop_execution_date,121), 'Is Running') AS [Stop_execution_date] , CASE ISNULL(CAST(ja.stop_execution_date AS VARCHAR(30)),'NULL') WHEN 'NULL' THEN CAST(DATEDIFF(ss,ja.start_execution_date,GETDATE()) AS VARCHAR(30)) ELSE 'Not running' END AS [Duration_secs] FROM msdb.dbo.sysjobs jobs LEFT JOIN msdb.dbo.sysjobactivity ja ON ja.job_id = jobs.job_id AND ja.start_execution_date IS NOT NULL WHERE jobs.name = 'test'
This next query will return the last execution and status for a given job, or all jobs if you comment out the WHERE clause. This can be used to monitor how long it’s been since a job last succeeded.
;WITH last_hist_rec AS ( SELECT ROW_NUMBER() OVER (PARTITION BY job_id ORDER BY run_date DESC, run_time DESC) AS [RowNum] , job_id , run_date AS [last_run_date] , run_time AS [last_run_time] , CASE run_status WHEN 0 THEN 'Failed' WHEN 1 THEN 'Succeeded' WHEN 2 THEN 'Retry' WHEN 3 THEN 'Canceled' END AS [run_status_desc] FROM msdb.dbo.sysjobhistory ) SELECT jobs.name AS [job_name] , hist.last_run_date , hist.last_run_time , hist.run_status_desc FROM msdb.dbo.sysjobs jobs LEFT JOIN last_hist_rec hist ON hist.job_id = jobs.job_id AND hist.RowNum = 1 WHERE jobs.name = 'test' --AND hist.[run_status_desc] = 'Failed' --AND hist.[run_status_desc] = 'Succeeded' --AND hist.[run_status_desc] = 'Retry' --AND hist.[run_status_desc] = 'Canceled'