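I have been using Solr in projects for more than a year, but only at the usage level: relying on Solr's existing mechanisms, adjusting parameters, then monitoring and tuning. I had never studied Solr at the source-code level. A recent project, however, convinced me that I have to understand Solr's internals and extension mechanisms thoroughly to do it well. Today's topic is how Solr starts up and initializes. As you know, a Solr deployment has two parts: 1. Solr's configuration files. 2. Solr's programs and plugins, the Lucene jars they depend on, and the logging jars. A study of Solr can follow the same path: loading the configuration files, initializing each core, initializing the request handlers in each core, and so on.
To study Solr's startup, begin with the web.xml of the Solr WAR. Here is a fragment of Solr's web.xml: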
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_2_5.xsd"
         version="2.5"
         metadata-complete="true"
>

  <!-- Uncomment if you are trying to use a Resin version before 3.0.19.
    Their XML implementation isn't entirely compatible with Xerces.
    Below are the implementations to use with Sun's JVM.
  <system-property javax.xml.xpath.XPathFactory=
             "com.sun.org.apache.xpath.internal.jaxp.XPathFactoryImpl"/>
  <system-property javax.xml.parsers.DocumentBuilderFactory=
             "com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderFactoryImpl"/>
  <system-property javax.xml.parsers.SAXParserFactory=
             "com.sun.org.apache.xerces.internal.jaxp.SAXParserFactoryImpl"/>
   -->

  <!-- People who want to hardcode their "Solr Home" directly into the
       WAR File can set the JNDI property here...
   -->
  <!-- Location of the Solr configuration files (Solr home), used when Solr initializes -->
  <env-entry>
    <env-entry-name>solr/home</env-entry-name>
    <env-entry-value>R:/solrhome1/solr</env-entry-value>
    <env-entry-type>java.lang.String</env-entry-type>
  </env-entry>

  <!-- org.apache.solr.servlet.SolrDispatchFilter is the most important piece of Solr startup,
       so source analysis of Solr should begin with this Filter. Its main job: load the Solr
       configuration files, initialize each core, and initialize each requestHandler and component -->
  <filter>
    <filter-name>SolrRequestFilter</filter-name>
    <filter-class>org.apache.solr.servlet.SolrDispatchFilter</filter-class>
    <!-- If you are wiring Solr into a larger web application which controls
         the web context root, you will probably want to mount Solr under
         a path prefix (app.war with /app/solr mounted into it, for example).
         You will need to put this prefix in front of the SolrDispatchFilter
         url-pattern mapping too (/solr/*), and also on any paths for
         legacy Solr servlet mappings you may be using.
         For the Admin UI to work properly in a path-prefixed configuration,
         the admin folder containing the resources needs to be under the app context root
         named to match the path-prefix.  For example:

            .war
               xxx
                 js
                   main.js
    -->
    <!--
    <init-param>
      <param-name>path-prefix</param-name>
      <param-value>/xxx</param-value>
    </init-param>
    -->
  </filter>
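SolrDispatchFilter is a Filter that extends BaseSolrFilter. (You probably already know what a Filter does; source analysis of web-framework-level products usually starts from a filter or a servlet.) Before introducing SolrDispatchFilter, a word on BaseSolrFilter (perhaps all programmers like to get to the bottom of things). BaseSolrFilter is an abstract class that implements the Filter interface. Its job is simple: it checks whether the logging jars have been loaded. The code fragment is as follows: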
/**
 * All Solr filters available to the user's webapp should
 * extend this class and not just implement {@link Filter}.
 * This class ensures that the logging configuration is correct
 * before any Solr specific code is executed.
 */
abstract class BaseSolrFilter implements Filter {

  static {
    // Runs once, when the class is initialized
    CheckLoggingConfiguration.check();
  }

}
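Why a static block in an abstract base class is enough to guard every Solr filter can be shown with a small sketch in plain Java. The names LoggingGuardBase and DemoFilter are hypothetical, and the check itself is only an assumption about the kind of work CheckLoggingConfiguration.check() does (confirming that a logging class can be loaded); java.util.logging.Logger is used purely because it ships with the JDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for BaseSolrFilter: the static block runs once,
// when the JVM initializes the class, before any subclass instance exists.
abstract class LoggingGuardBase {
  static final List<String> EVENTS = new ArrayList<>();

  static {
    EVENTS.add("base-static-check");
    // Assumption: the real check verifies that a logging class can be loaded.
    if (!isPresent("java.util.logging.Logger")) {
      throw new IllegalStateException("logging classes are missing");
    }
  }

  // Returns true when the named class can be found on the classpath.
  static boolean isPresent(String className) {
    try {
      Class.forName(className);
      return true;
    } catch (ClassNotFoundException e) {
      return false;
    }
  }
}

// Hypothetical concrete filter: constructing it forces the base class to initialize first.
class DemoFilter extends LoggingGuardBase {
  DemoFilter() {
    EVENTS.add("filter-constructed");
  }
}
```

Creating a DemoFilter records the static check before the constructor entry, which is the ordering BaseSolrFilter relies on: the logging check happens before any Solr-specific filter code runs.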
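For brevity, I will not go into what CheckLoggingConfiguration.check() does. Back to SolrDispatchFilter. Because BaseSolrFilter is abstract, the concrete class SolrDispatchFilter must implement the methods of the Filter interface. The Filter interface looks like this: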
public interface Filter {

    // Performs initialization
    public void init(FilterConfig filterConfig) throws ServletException;

    // Intercepts every HTTP request mapped to this filter
    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain) throws IOException, ServletException;

    // Performs cleanup when the filter is taken out of service
    public void destroy();
}
As the comments above show, initialization happens in the init method. To study how SolrDispatchFilter initializes, we therefore cannot avoid this method. Here is SolrDispatchFilter's init method:
@Override
public void init(FilterConfig config) throws ServletException {
  log.info("SolrDispatchFilter.init()");

  try {
    // web.xml configuration
    this.pathPrefix = config.getInitParameter( "path-prefix" );

    // Readers, all the magic happens in this method
    this.cores = createCoreContainer();
    log.info("user.dir=" + System.getProperty("user.dir"));
  }
  catch( Throwable t ) {
    // catch this so our filter still works
    log.error( "Could not start Solr. Check solr/home property and the logs");
    SolrCore.log( t );
    if (t instanceof Error) {
      throw (Error) t;
    }
  }

  log.info("SolrDispatchFilter.init() done");
}
Let's follow the trail and see what createCoreContainer actually does.
protected CoreContainer createCoreContainer() {
  // Note: SolrResourceLoader is what loads the configuration files under Solr home
  SolrResourceLoader loader = new SolrResourceLoader(SolrResourceLoader.locateSolrHome());
  // Load the configuration
  ConfigSolr config = loadConfigSolr(loader);
  CoreContainer cores = new CoreContainer(loader, config);
  // Initialize the cores
  cores.load();
  return cores;
}
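createCoreContainer is the key to understanding how Solr initializes and starts. Here is a brief look at the classes and methods it uses:
SolrResourceLoader, as its name suggests, is Solr's resource loader.
ConfigSolr reads the information in the Solr configuration file through SolrResourceLoader.
loadConfigSolr is the method that loads the configuration: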
private ConfigSolr loadConfigSolr(SolrResourceLoader loader) {
  // The solr.solrxml.location system property takes precedence; it is typically set so that
  // Solr initializes from configuration stored in ZooKeeper. If it is not set, solr.xml is read
  // from the Solr home directory (remember the first entry in web.xml? That is the one).
  String solrxmlLocation = System.getProperty("solr.solrxml.location", "solrhome");

  if (solrxmlLocation == null || "solrhome".equalsIgnoreCase(solrxmlLocation))
    return ConfigSolr.fromSolrHome(loader, loader.getInstanceDir());

  // Read the configuration from ZooKeeper; this is how Solr initializes in a SolrCloud cluster
  if ("zookeeper".equalsIgnoreCase(solrxmlLocation)) {
    String zkHost = System.getProperty("zkHost");
    log.info("Trying to read solr.xml from " + zkHost);
    if (StringUtils.isEmpty(zkHost))
      throw new SolrException(ErrorCode.SERVER_ERROR,
          "Could not load solr.xml from zookeeper: zkHost system property not set");
    SolrZkClient zkClient = new SolrZkClient(zkHost, 30000);
    try {
      // solr.xml holds the ZooKeeper-related configuration
      if (!zkClient.exists("/solr.xml", true))
        throw new SolrException(ErrorCode.SERVER_ERROR,
            "Could not load solr.xml from zookeeper: node not found");
      byte[] data = zkClient.getData("/solr.xml", null, null, true);
      // Load the configuration
      return ConfigSolr.fromInputStream(loader, new ByteArrayInputStream(data));
    } catch (Exception e) {
      throw new SolrException(ErrorCode.SERVER_ERROR, "Could not load solr.xml from zookeeper", e);
    } finally {
      zkClient.close(); // close the ZooKeeper connection
    }
  }

  throw new SolrException(ErrorCode.SERVER_ERROR,
      "Bad solr.solrxml.location set: " + solrxmlLocation + " - should be 'solrhome' or 'zookeeper'");
}
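The branching in loadConfigSolr can be isolated from Solr and ZooKeeper to make the decision rule easier to see. The following is a minimal sketch, not Solr code: the class SolrXmlSourceSketch, the enum Source, and the method resolve are hypothetical names, and the properties are passed in as an argument (an assumption made so the logic can be exercised without changing real system properties). Only the property names solr.solrxml.location and zkHost come from the method above.

```java
import java.util.Properties;

class SolrXmlSourceSketch {
  // Hypothetical labels for the two places solr.xml can come from.
  enum Source { SOLR_HOME, ZOOKEEPER }

  // Mirrors the branching of loadConfigSolr without opening any connection.
  static Source resolve(Properties props) {
    // A missing property falls back to "solrhome", as in the original.
    String location = props.getProperty("solr.solrxml.location", "solrhome");
    if ("solrhome".equalsIgnoreCase(location)) {
      return Source.SOLR_HOME;
    }
    if ("zookeeper".equalsIgnoreCase(location)) {
      String zkHost = props.getProperty("zkHost");
      if (zkHost == null || zkHost.isEmpty()) {
        throw new IllegalStateException("zkHost system property not set");
      }
      return Source.ZOOKEEPER;
    }
    // Any other value is rejected outright.
    throw new IllegalArgumentException(
        "Bad solr.solrxml.location set: " + location + " - should be 'solrhome' or 'zookeeper'");
  }
}
```

The comparison ignores case, so "ZooKeeper" and "zookeeper" select the same branch, while any value other than the two known ones is rejected instead of silently falling back to Solr home.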
CoreContainer is what initializes the cores. Let's focus on its load method. It is rather long:
public void load() {

  log.info("Loading cores into CoreContainer [instanceDir={}]", loader.getInstanceDir());

  // Load Solr's shared jar library
  // add the sharedLib to the shared resource loader before initializing cfg based plugins
  String libDir = cfg.getSharedLibDirectory();
  if (libDir != null) {
    File f = FileUtils.resolvePath(new File(solrHome), libDir);
    log.info("loading shared library: " + f.getAbsolutePath());
    // If class loaders are unfamiliar to you, this call is worth stepping into
    loader.addToClassLoader(libDir, null, false);
    loader.reloadLuceneSPI();
  }

  // Load and initialize the shard-related handlers
  shardHandlerFactory = ShardHandlerFactory.newInstance(cfg.getShardHandlerFactoryPluginInfo(), loader);

  updateShardHandler = new UpdateShardHandler(cfg);

  solrCores.allocateLazyCores(cfg.getTransientCacheSize(), loader);

  logging = LogWatcher.newRegisteredLogWatcher(cfg.getLogWatcherConfig(), loader);

  hostName = cfg.getHost();
  log.info("Host Name: " + hostName);

  zkSys.initZooKeeper(this, solrHome, cfg);

  collectionsHandler = createHandler(cfg.getCollectionsHandlerClass(), CollectionsHandler.class);
  infoHandler = createHandler(cfg.getInfoHandlerClass(), InfoHandler.class);
  coreAdminHandler = createHandler(cfg.getCoreAdminHandlerClass(), CoreAdminHandler.class);

  // Core configuration service; uses the ZooKeeper configuration to initialize Solr cores
  coreConfigService = cfg.createCoreConfigService(loader, zkSys.getZkController());

  containerProperties = cfg.getSolrProperties("solr");

  // setup executor to load cores in parallel
  // do not limit the size of the executor in zk mode since cores may try and wait for each other.
  // Cores are initialized on multiple threads; if multithreading is new to you, linger here a while
  ExecutorService coreLoadExecutor = Executors.newFixedThreadPool(
      ( zkSys.getZkController() == null ? cfg.getCoreLoadThreadCount() : Integer.MAX_VALUE ),
      new DefaultSolrThreadFactory("coreLoadExecutor") );

  try {
    CompletionService<SolrCore> completionService = new ExecutorCompletionService<>(
        coreLoadExecutor);

    Set<Future<SolrCore>> pending = new HashSet<>();

    List<CoreDescriptor> cds = coresLocator.discover(this);
    checkForDuplicateCoreNames(cds);

    for (final CoreDescriptor cd : cds) {

      final String name = cd.getName();
      try {

        if (cd.isTransient() || ! cd.isLoadOnStartup()) {
          // Store it away for later use. includes non-transient but not
          // loaded at startup cores.
          solrCores.putDynamicDescriptor(name, cd);
        }
        if (cd.isLoadOnStartup()) { // The normal case

          Callable<SolrCore> task = new Callable<SolrCore>() {
            @Override
            public SolrCore call() {
              SolrCore c = null;
              try {
                if (zkSys.getZkController() != null) { // ZooKeeper mode
                  preRegisterInZk(cd);
                }
                c = create(cd); // regular creation
                registerCore(cd.isTransient(), name, c, false, false);
              } catch (Exception e) {
                SolrException.log(log, null, e);
                try {
                  /* if (isZooKeeperAware()) {
                    try {
                      zkSys.zkController.unregister(name, cd);
                    } catch (InterruptedException e2) {
                      Thread.currentThread().interrupt();
                      SolrException.log(log, null, e2);
                    } catch (KeeperException e3) {
                      SolrException.log(log, null, e3);
                    }
                  }*/
                } finally {
                  if (c != null) {
                    c.close();
                  }
                }
              }
              return c;
            }
          };
          pending.add(completionService.submit(task));

        }
      } catch (Exception e) {
        SolrException.log(log, null, e);
      }
    }

    while (pending != null && pending.size() > 0) {
      try {
        // Take the next core that has finished loading
        Future<SolrCore> future = completionService.take();
        if (future == null) return;
        pending.remove(future);

        try {
          SolrCore c = future.get();
          // track original names
          if (c != null) {
            solrCores.putCoreToOrigName(c, c.getName());
          }
        } catch (ExecutionException e) {
          SolrException.log(SolrCore.log, "Error loading core", e);
        }

      } catch (InterruptedException e) {
        throw new SolrException(SolrException.ErrorCode.SERVICE_UNAVAILABLE,
            "interrupted while loading core", e);
      }
    }

    // Background thread for the Solr cores; releases resources when the container
    // shuts down or startup fails
    // Start the background thread
    backgroundCloser = new CloserThread(this, solrCores, cfg);
    backgroundCloser.start();

  } finally {
    if (coreLoadExecutor != null) {
      // Initialization is finished; shut down the thread pool
      ExecutorUtil.shutdownNowAndAwaitTermination(coreLoadExecutor);
    }
  }

  if (isZooKeeperAware()) { // ZooKeeper is available, i.e. SolrCloud mode
    // register in zk in background threads
    Collection<SolrCore> cores = getCores();
    if (cores != null) {
      for (SolrCore core : cores) {
        try {
          // Register the core's state in ZooKeeper
          zkSys.registerInZk(core, true);
        } catch (Throwable t) {
          SolrException.log(log, "Error registering SolrCore", t);
        }
      }
    }
    zkSys.getZkController().checkOverseerDesignate();
  }
}
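The parallel loading in load() follows a standard java.util.concurrent pattern: submit one Callable per core to an ExecutorCompletionService, then call take() for finished futures until nothing is pending. The sketch below reproduces only that pattern in plain Java, with no Solr classes. The class ParallelCoreLoadSketch, the method loadAll, and the rule that names starting with "bad" fail to load are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParallelCoreLoadSketch {

  // Loads every named "core" on a fixed-size pool and returns the names that loaded.
  // Hypothetical rule: a name starting with "bad" fails, to show that one
  // failure does not stop the remaining loads.
  static List<String> loadAll(List<String> coreNames, int threads) {
    ExecutorService executor = Executors.newFixedThreadPool(threads);
    List<String> loaded = new ArrayList<>();
    try {
      CompletionService<String> completion = new ExecutorCompletionService<>(executor);
      Set<Future<String>> pending = new HashSet<>();
      for (final String name : coreNames) {
        Callable<String> task = () -> {
          if (name.startsWith("bad")) {
            throw new IllegalStateException("cannot load " + name);
          }
          return name;
        };
        pending.add(completion.submit(task));
      }
      while (!pending.isEmpty()) {
        Future<String> done = completion.take(); // blocks until some task finishes
        pending.remove(done);
        try {
          loaded.add(done.get());
        } catch (ExecutionException e) {
          // Report and continue with the other cores.
          System.err.println("Error loading core: " + e.getCause().getMessage());
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt for callers
      throw new IllegalStateException("interrupted while loading cores", e);
    } finally {
      executor.shutdownNow(); // loading is finished, release the pool threads
    }
    return loaded;
  }
}
```

Called with the names core1, bad2 and core3 and two threads, loadAll returns core1 and core3 in completion order, which can differ between runs. The pool size is the tuning knob here; in load() it comes from cfg.getCoreLoadThreadCount() when ZooKeeper is not in use.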
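I have commented the key parts of this code. When you need to speed up your Solr startup, you will come back to study it. Next, we look at how Solr filters and handles requests, which means turning to the doFilter method (I have commented the key parts and will not go into detail):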
// Signature inferred from the recursive doFilter(request, response, chain, true) call below;
// the original excerpt starts at the method body
public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain, boolean retry)
    throws IOException, ServletException {
  if( abortErrorMessage != null ) { // abort with HTTP 500
    ((HttpServletResponse)response).sendError( 500, abortErrorMessage );
    return;
  }

  if (this.cores == null) { // core container failed to initialize or is already shut down
    ((HttpServletResponse)response).sendError( 503, "Server is shutting down or failed to initialize" );
    return;
  }
  CoreContainer cores = this.cores;
  SolrCore core = null;
  SolrQueryRequest solrReq = null;
  Aliases aliases = null;

  if( request instanceof HttpServletRequest) { // only HTTP requests are handled here
    HttpServletRequest req = (HttpServletRequest)request;
    HttpServletResponse resp = (HttpServletResponse)response;
    SolrRequestHandler handler = null;
    String corename = "";
    String origCorename = null;
    try {
      // put the core container in request attribute
      req.setAttribute("org.apache.solr.CoreContainer", cores);
      String path = req.getServletPath();
      if( req.getPathInfo() != null ) {
        // this lets you handle /update/commit when /update is a servlet
        path += req.getPathInfo();
      }
      if( pathPrefix != null && path.startsWith( pathPrefix ) ) {
        path = path.substring( pathPrefix.length() );
      }
      // check for management path
      String alternate = cores.getManagementPath();
      if (alternate != null && path.startsWith(alternate)) {
        path = path.substring(0, alternate.length());
      }
      // unused feature ?
      int idx = path.indexOf( ':' );
      if( idx > 0 ) {
        // save the portion after the ':' for a 'handler' path parameter
        path = path.substring( 0, idx );
      }

      // Check for the core admin page
      if( path.equals( cores.getAdminPath() ) ) { // Solr admin page request
        handler = cores.getMultiCoreHandler();
        solrReq = SolrRequestParsers.DEFAULT.parse(null,path, req);
        handleAdminRequest(req, response, handler, solrReq);
        return;
      }
      boolean usingAliases = false;
      List<String> collectionsList = null;
      // Check for the core admin collections url
      if( path.equals( "/admin/collections" ) ) { // collections management
        handler = cores.getCollectionsHandler();
        solrReq = SolrRequestParsers.DEFAULT.parse(null,path, req);
        handleAdminRequest(req, response, handler, solrReq);
        return;
      }
      // Check for the core admin info url
      if( path.startsWith( "/admin/info" ) ) { // admin info
        handler = cores.getInfoHandler();
        solrReq = SolrRequestParsers.DEFAULT.parse(null,path, req);
        handleAdminRequest(req, response, handler, solrReq);
        return;
      }
      else {
        //otherwise, we should find a core from the path
        idx = path.indexOf( "/", 1 );
        if( idx > 1 ) {
          // try to get the corename as a request parameter first
          corename = path.substring( 1, idx );

          // look at aliases
          if (cores.isZooKeeperAware()) { // SolrCloud mode
            origCorename = corename;
            ZkStateReader reader = cores.getZkController().getZkStateReader();
            aliases = reader.getAliases();
            if (aliases != null && aliases.collectionAliasSize() > 0) {
              usingAliases = true;
              String alias = aliases.getCollectionAlias(corename);
              if (alias != null) {
                collectionsList = StrUtils.splitSmart(alias, ",", true);
                corename = collectionsList.get(0);
              }
            }
          }

          core = cores.getCore(corename);

          if (core != null) {
            path = path.substring( idx );
          }
        }
        if (core == null) {
          if (!cores.isZooKeeperAware() ) {
            core = cores.getCore("");
          }
        }
      }

      if (core == null && cores.isZooKeeperAware()) {
        // we couldn't find the core - lets make sure a collection was not specified instead
        core = getCoreByCollection(cores, corename, path);

        if (core != null) {
          // we found a core, update the path
          path = path.substring( idx );
        }

        // if we couldn't find it locally, look on other nodes
        if (core == null && idx > 0) {
          String coreUrl = getRemotCoreUrl(cores, corename, origCorename);
          // don't proxy for internal update requests
          SolrParams queryParams = SolrRequestParsers.parseQueryString(req.getQueryString());
          if (coreUrl != null
              && queryParams.get(DistributingUpdateProcessorFactory.DISTRIB_UPDATE_PARAM) == null) {
            path = path.substring(idx);
            remoteQuery(coreUrl + path, req, solrReq, resp);
            return;
          } else {
            if (!retry) {
              // we couldn't find a core to work with, try reloading aliases
              // TODO: it would be nice if admin ui elements skipped this...
              ZkStateReader reader = cores.getZkController().getZkStateReader();
              reader.updateAliases();
              doFilter(request, response, chain, true);
              return;
            }
          }
        }

        // try the default core
        if (core == null) {
          core = cores.getCore("");
        }
      }

      // With a valid core...
      if( core != null ) { // a valid core was found
        final SolrConfig config = core.getSolrConfig();
        // get or create/cache the parser for the core
        SolrRequestParsers parser = config.getRequestParsers();

        // Handle /schema/* and /config/* paths via Restlet
        if( path.equals("/schema") || path.startsWith("/schema/")
            || path.equals("/config") || path.startsWith("/config/")) { // entry point of the Solr REST API
          solrReq = parser.parse(core, path, req);
          SolrRequestInfo.setRequestInfo(new SolrRequestInfo(solrReq, new SolrQueryResponse()));
          if( path.equals(req.getServletPath()) ) {
            // avoid endless loop - pass through to Restlet via webapp
            chain.doFilter(request, response);
          } else {
            // forward rewritten URI (without path prefix and core/collection name) to Restlet
            req.getRequestDispatcher(path).forward(request, response);
          }
          return;
        }

        // Determine the handler from the url path if not set
        // (we might already have selected the cores handler)
        if( handler == null && path.length() > 1 ) { // don't match "" or "/" as valid path
          handler = core.getRequestHandler( path );
          // no handler yet but allowed to handle select; let's check
          if( handler == null && parser.isHandleSelect() ) {
            if( "/select".equals( path ) || "/select/".equals( path ) ) { // entry point for Solr's various queries
              solrReq = parser.parse( core, path, req );
              String qt = solrReq.getParams().get( CommonParams.QT );
              handler = core.getRequestHandler( qt );
              if( handler == null ) {
                throw new SolrException( SolrException.ErrorCode.BAD_REQUEST, "unknown handler: "+qt);
              }
              if( qt != null && qt.startsWith("/") && (handler instanceof ContentStreamHandlerBase)) {
                //For security reasons it's a bad idea to allow a leading '/', ex: /select?qt=/update see SOLR-3161
                //There was no restriction from Solr 1.4 thru 3.5 and it's not supported for update handlers.
                throw new SolrException( SolrException.ErrorCode.BAD_REQUEST,
                    "Invalid Request Handler ('qt').  Do not use /select to access: "+qt);
              }
            }
          }
        }

        // With a valid handler and a valid core...
        if( handler != null ) {
          // if not a /select, create the request
          if( solrReq == null ) {
            solrReq = parser.parse( core, path, req );
          }

          if (usingAliases) {
            processAliases(solrReq, aliases, collectionsList);
          }

          final Method reqMethod = Method.getMethod(req.getMethod());
          HttpCacheHeaderUtil.setCacheControlHeader(config, resp, reqMethod);
          // unless we have been explicitly told not to, do cache validation
          // if we fail cache validation, execute the query
          if (config.getHttpCachingConfig().isNever304() ||
              !HttpCacheHeaderUtil.doCacheHeaderValidation(solrReq, req, reqMethod, resp)) {
            // Solr HTTP caching: expiry is controlled through the response headers
            SolrQueryResponse solrRsp = new SolrQueryResponse();
            /* even for HEAD requests, we need to execute the handler to
             * ensure we don't get an error (and to make sure the correct
             * QueryResponseWriter is selected and we get the correct
             * Content-Type)
             */
            SolrRequestInfo.setRequestInfo(new SolrRequestInfo(solrReq, solrRsp));
            this.execute( req, handler, solrReq, solrRsp );
            HttpCacheHeaderUtil.checkHttpCachingVeto(solrRsp, resp, reqMethod);
            // add info to http headers
            //TODO: See SOLR-232 and SOLR-267.
            /*try {
              NamedList solrRspHeader = solrRsp.getResponseHeader();
              for (int i=0; i<solrRspHeader.size(); i++) {
                ((javax.servlet.http.HttpServletResponse) response).addHeader(("Solr-" + solrRspHeader.getName(i)), String.valueOf(solrRspHeader.getVal(i)));
              }
            } catch (ClassCastException cce) {
              log.log(Level.WARNING, "exception adding response header log information", cce);
            }*/
            QueryResponseWriter responseWriter = core.getQueryResponseWriter(solrReq);
            writeResponse(solrRsp, response, responseWriter, solrReq, reqMethod);
          }
          return; // we are done with a valid handler
        }
      }
      log.debug("no handler or core retrieved for " + path + ", follow through...");
    }
    catch (Throwable ex) {
      sendError( core, solrReq, request, (HttpServletResponse)response, ex );
      if (ex instanceof Error) {
        throw (Error) ex;
      }
      return;
    } finally {
      try {
        if (solrReq != null) {
          log.debug("Closing out SolrRequest: {}", solrReq);
          solrReq.close();
        }
      } finally {
        try {
          if (core != null) {
            core.close();
          }
        } finally {
          SolrRequestInfo.clearRequestInfo();
        }
      }
    }
  }

  // Otherwise let the webapp handle the request
  chain.doFilter(request, response);
}
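The part of doFilter that turns a request path into a core name and a handler path is easy to lose in the surrounding logic. It can be sketched on its own in plain Java as follows. RequestPathSketch and split are hypothetical names, and the sketch makes one simplifying assumption: it always strips the first path segment, whereas the real filter does so only when cores.getCore(corename) actually finds a core. The core name collection1 in the usage note is likewise just an example.

```java
class RequestPathSketch {

  // Returns {coreName, handlerPath}. coreName is "" when the path has no
  // second '/', the case where the filter falls back to the default core.
  static String[] split(String path, String pathPrefix) {
    // Drop the optional path-prefix configured in web.xml.
    if (pathPrefix != null && path.startsWith(pathPrefix)) {
      path = path.substring(pathPrefix.length());
    }
    // Look for the '/' that ends the first segment, skipping the leading one.
    int idx = path.indexOf('/', 1);
    if (idx > 1) {
      return new String[] { path.substring(1, idx), path.substring(idx) };
    }
    return new String[] { "", path };
  }
}
```

Under these assumptions, "/collection1/select" splits into the core name collection1 and the handler path "/select", while a bare "/select" yields an empty core name and leaves the path unchanged. The handler path is what the filter then passes to core.getRequestHandler.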
Please credit the source when reposting this article: http://www.cnblogs.com/likehua/p/4353608.html