本文主要記錄一下spring schedule調度異常。html
@Scheduled(cron = "0 0 22 * * ?") public void doSomething() throws InterruptedException { //remote http call }
從某一天開始這個調度沒有執行
查看了下日誌,沒有拋異常java
查了那個時間點的系統指標,沒啥異常,附近有個young gc,不過也就80msandroid
jstack -l pid
初步看來線程堆棧,沒看到有啥死鎖信息web
一開始有點懷疑是否是spring schedule由於線程池線程不夠,或者前面幾個時間點的定時任務執行時間過長致使這個調度任務一直沒法執行,或者致使了misfire的狀況。spring
後來線下單獨試圖去復現,發現spring schedule只要有放入隊列,即便misfire了,過了設定的調度時間點,仍是會執行的。就單單這個定時任務有問題,其餘的定時任務天天都執行好好的。segmentfault
java/util/concurrent/ScheduledThreadPoolExecutor.javaspringboot
public ScheduledThreadPoolExecutor(int corePoolSize, ThreadFactory threadFactory, RejectedExecutionHandler handler) { super(corePoolSize, Integer.MAX_VALUE, 0, NANOSECONDS, new DelayedWorkQueue(), threadFactory, handler); } public ThreadPoolExecutor(int corePoolSize, int maximumPoolSize, long keepAliveTime, TimeUnit unit, BlockingQueue<Runnable> workQueue, ThreadFactory threadFactory, RejectedExecutionHandler handler) { if (corePoolSize < 0 || maximumPoolSize <= 0 || maximumPoolSize < corePoolSize || keepAliveTime < 0) throw new IllegalArgumentException(); if (workQueue == null || threadFactory == null || handler == null) throw new NullPointerException(); this.corePoolSize = corePoolSize; this.maximumPoolSize = maximumPoolSize; this.workQueue = workQueue; this.keepAliveTime = unit.toNanos(keepAliveTime); this.threadFactory = threadFactory; this.handler = handler; }
這裏建立的是DelayedWorkQueue
java/util/concurrent/ThreadPoolExecutor.javaless
final void runWorker(Worker w) { Thread wt = Thread.currentThread(); Runnable task = w.firstTask; w.firstTask = null; w.unlock(); // allow interrupts boolean completedAbruptly = true; try { while (task != null || (task = getTask()) != null) { w.lock(); // If pool is stopping, ensure thread is interrupted; // if not, ensure thread is not interrupted. This // requires a recheck in second case to deal with // shutdownNow race while clearing interrupt if ((runStateAtLeast(ctl.get(), STOP) || (Thread.interrupted() && runStateAtLeast(ctl.get(), STOP))) && !wt.isInterrupted()) wt.interrupt(); try { beforeExecute(wt, task); Throwable thrown = null; try { task.run(); } catch (RuntimeException x) { thrown = x; throw x; } catch (Error x) { thrown = x; throw x; } catch (Throwable x) { thrown = x; throw new Error(x); } finally { afterExecute(task, thrown); } } finally { task = null; w.completedTasks++; w.unlock(); } } completedAbruptly = false; } finally { processWorkerExit(w, completedAbruptly); } }
這裏調用了task.run();
/** * Overrides FutureTask version so as to reset/requeue if periodic. */ public void run() { boolean periodic = isPeriodic(); if (!canRunInCurrentRunState(periodic)) cancel(false); else if (!periodic) ScheduledFutureTask.super.run(); else if (ScheduledFutureTask.super.runAndReset()) { setNextRunTime(); reExecutePeriodic(outerTask); } } /** * Requeues a periodic task unless current run state precludes it. * Same idea as delayedExecute except drops task rather than rejecting. * * @param task the task */ void reExecutePeriodic(RunnableScheduledFuture<?> task) { if (canRunInCurrentRunState(true)) { super.getQueue().add(task); if (!canRunInCurrentRunState(true) && remove(task)) task.cancel(false); else ensurePrestart(); } }
這裏每當執行完以後判斷下一次的執行時間,而後從新放入隊列,這樣完成周期性調度。看到這裏再去仔細看來下日誌,發現有一天的調度任務執行了沒有打印結束日誌,那麼開始懷疑是否是這個定時任務一直沒有執行完,致使接下來的幾天的調度都一直無法放入隊列執行。socket
"job-async-5" #44 prio=5 os_prio=0 tid=0x00007fb9aa7a4000 nid=0x42 runnable [0x00007fb98dfee000] java.lang.Thread.State: RUNNABLE at java.net.SocketInputStream.socketRead0(Native Method) at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) at java.net.SocketInputStream.read(SocketInputStream.java:170) at java.net.SocketInputStream.read(SocketInputStream.java:141) at java.io.BufferedInputStream.fill(BufferedInputStream.java:246) at java.io.BufferedInputStream.read1(BufferedInputStream.java:286) at java.io.BufferedInputStream.read(BufferedInputStream.java:345) - locked <0x00000006c6167e60> (a java.io.BufferedInputStream) at sun.net.www.http.HttpClient.parseHTTPHeader(HttpClient.java:704) at sun.net.www.http.HttpClient.parseHTTP(HttpClient.java:647) at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1536) - locked <0x00000006c6167eb8> (a sun.net.www.protocol.http.HttpURLConnection) at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1441) - locked <0x00000006c6167eb8> (a sun.net.www.protocol.http.HttpURLConnection) at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480) at org.springframework.http.client.SimpleClientHttpResponse.getRawStatusCode(SimpleClientHttpResponse.java:52) at org.springframework.cloud.netflix.metrics.DefaultMetricsTagProvider.clientHttpRequestTags(DefaultMetricsTagProvider.java:44) at org.springframework.cloud.netflix.metrics.MetricsClientHttpRequestInterceptor.intercept(MetricsClientHttpRequestInterceptor.java:75) at org.springframework.http.client.InterceptingClientHttpRequest$InterceptingRequestExecution.execute(InterceptingClientHttpRequest.java:86) at org.springframework.http.client.InterceptingClientHttpRequest.executeInternal(InterceptingClientHttpRequest.java:70) at org.springframework.http.client.AbstractBufferingClientHttpRequest.executeInternal(AbstractBufferingClientHttpRequest.java:48) at org.springframework.http.client.AbstractClientHttpRequest.execute(AbstractClientHttpRequest.java:53) at org.springframework.web.client.RestTemplate.doExecute(RestTemplate.java:652) at org.springframework.web.client.RestTemplate.execute(RestTemplate.java:613) at org.springframework.web.client.RestTemplate.postForEntity(RestTemplate.java:407) at org.springframework.web.client.RestTemplate$$FastClassBySpringCGLIB$$aa4e9ed0.invoke(<generated>) at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:204) at org.springframework.aop.framework.CglibAopProxy$CglibMethodInvocation.invokeJoinpoint(CglibAopProxy.java:721) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:157) at org.springframework.aop.aspectj.MethodInvocationProceedingJoinPoint.proceed(MethodInvocationProceedingJoinPoint.java:85) at org.springframework.cloud.netflix.metrics.RestTemplateUrlTemplateCapturingAspect.captureUrlTemplate(RestTemplateUrlTemplateCapturingAspect.java:33) at sun.reflect.GeneratedMethodAccessor235.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:629) at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:618) at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:70) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92) at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:179) at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:656) at org.springframework.web.client.RestTemplate$$EnhancerBySpringCGLIB$$68afae5e.postForEntity(<generated>)
又在查看了沒有結束的定時任務的線程堆棧,發現確實是卡住了,不過這裏線程狀態一直是RUNNABLE.
罪魁禍首就在這個SocketInputStream.socketRead0。不過有個疑問,默認RestTemplate應該是有設置超時時間纔對的。async
spring-web-4.3.7.RELEASE-sources.jar!/org/springframework/web/client/RestTemplate.java
protected <T> T doExecute(URI url, HttpMethod method, RequestCallback requestCallback, ResponseExtractor<T> responseExtractor) throws RestClientException { Assert.notNull(url, "'url' must not be null"); Assert.notNull(method, "'method' must not be null"); ClientHttpResponse response = null; try { ClientHttpRequest request = createRequest(url, method); if (requestCallback != null) { requestCallback.doWithRequest(request); } response = request.execute(); handleResponse(url, method, response); if (responseExtractor != null) { return responseExtractor.extractData(response); } else { return null; } } catch (IOException ex) { String resource = url.toString(); String query = url.getRawQuery(); resource = (query != null ? resource.substring(0, resource.indexOf(query) - 1) : resource); throw new ResourceAccessException("I/O error on " + method.name() + " request for \"" + resource + "\": " + ex.getMessage(), ex); } finally { if (response != null) { response.close(); } } }
這裏調用了createRequest
spring-web-4.3.7.RELEASE-sources.jar!/org/springframework/http/client/support/HttpAccessor.java
private ClientHttpRequestFactory requestFactory = new SimpleClientHttpRequestFactory(); protected ClientHttpRequest createRequest(URI url, HttpMethod method) throws IOException { ClientHttpRequest request = getRequestFactory().createRequest(url, method); if (logger.isDebugEnabled()) { logger.debug("Created " + method.name() + " request for \"" + url + "\""); } return request; }
默認是SimpleClientHttpRequestFactory
spring-web-4.3.7.RELEASE-sources.jar!/org/springframework/http/client/SimpleClientHttpRequestFactory.java
private int connectTimeout = -1; private int readTimeout = -1; @Override public ClientHttpRequest createRequest(URI uri, HttpMethod httpMethod) throws IOException { HttpURLConnection connection = openConnection(uri.toURL(), this.proxy); prepareConnection(connection, httpMethod.name()); if (this.bufferRequestBody) { return new SimpleBufferingClientHttpRequest(connection, this.outputStreaming); } else { return new SimpleStreamingClientHttpRequest(connection, this.chunkSize, this.outputStreaming); } } /** * Template method for preparing the given {@link HttpURLConnection}. * <p>The default implementation prepares the connection for input and output, and sets the HTTP method. * @param connection the connection to prepare * @param httpMethod the HTTP request method ({@code GET}, {@code POST}, etc.) * @throws IOException in case of I/O errors */ protected void prepareConnection(HttpURLConnection connection, String httpMethod) throws IOException { if (this.connectTimeout >= 0) { connection.setConnectTimeout(this.connectTimeout); } if (this.readTimeout >= 0) { connection.setReadTimeout(this.readTimeout); } connection.setDoInput(true); if ("GET".equals(httpMethod)) { connection.setInstanceFollowRedirects(true); } else { connection.setInstanceFollowRedirects(false); } if ("POST".equals(httpMethod) || "PUT".equals(httpMethod) || "PATCH".equals(httpMethod) || "DELETE".equals(httpMethod)) { connection.setDoOutput(true); } else { connection.setDoOutput(false); } connection.setRequestMethod(httpMethod); }
這裏prepareConnection進行了超時設置,不過connectTimeout和readTimeout默認都是-1,至關於永遠阻塞。這就是出問題的地方。
建立resttemplate的時候,指定設置好超時時間的ClientHttpRequestFactory
若是沒有依賴http client,則默認使用的是SimpleClientHttpRequestFactory
@Bean public RestTemplate restTemplate() { SimpleClientHttpRequestFactory clientHttpRequestFactory = new SimpleClientHttpRequestFactory(); clientHttpRequestFactory.setConnectTimeout(30000); clientHttpRequestFactory.setReadTimeout(30000); RestTemplate restTemplate = new RestTemplate(clientHttpRequestFactory); return restTemplate; }
注意這裏單位是毫秒
@Bean public RestTemplate customRestTemplate(){ HttpComponentsClientHttpRequestFactory httpRequestFactory = new HttpComponentsClientHttpRequestFactory(); httpRequestFactory.setConnectionRequestTimeout(3000); httpRequestFactory.setConnectTimeout(3000); httpRequestFactory.setReadTimeout(3000); return new RestTemplate(httpRequestFactory); }
若是你設置了timeout,依然還阻塞在SocketInputStream.socketRead0,那麼請升級JDK,JDK有個bug: SocketInputStream.socketRead0 can hang even with soTimeout set。