macOS (10.13.1) + Python (3.6.1)
brew install geckodriver
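==Note: when scraping dynamic web pages through an external browser, here Firefox (version 57.0.1), DOM operations have to go through geckodriver. After searching online for a while, I found some people saying that geckodriver was not needed before a certain Firefox version, but I don't remember the details.==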
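To confirm that the brew install put geckodriver somewhere Selenium can find it, a quick PATH check helps. This is a minimal sketch using only the standard library; find_geckodriver is a hypothetical helper name, not part of Selenium.

```python
import shutil


def find_geckodriver():
    """Return the full path of geckodriver if it is on PATH, else None."""
    return shutil.which("geckodriver")


if __name__ == "__main__":
    path = find_geckodriver()
    if path is None:
        print("geckodriver not found; try: brew install geckodriver")
    else:
        print("geckodriver found at", path)
```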
Re-download tornado. Download page: https://pypi.python.org/pypi/tornado
1. Download: wget https://pypi.python.org/packages/df/42/a180ee540e12e2ec1007ac82a42b09dd92e5461e09c98bf465e98646d187/tornado-4.5.1.tar.gz#md5=838687d20923360af5ab59f101e9e02e
2. Extract: tar -zxvf tornado-4.5.1.tar.gz
3. cd tornado-4.5.1
4. python setup.py build
5. python setup.py install
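After the install step it is worth checking which tornado the interpreter actually picks up. A minimal sketch follows; installed_tornado_version is a hypothetical helper name, and tornado.version is the version string the package exposes. With the tarball above it should report 4.5.1, assuming no other copy of tornado shadows it.

```python
def installed_tornado_version():
    """Return tornado's version string, or None if tornado cannot be imported."""
    try:
        import tornado
    except ImportError:
        return None
    return tornado.version


if __name__ == "__main__":
    print("tornado version:", installed_tornado_version())
```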
    lst_news = self.driver.find_elements_by_xpath('//ul[@class="sameday_list"]/li')
    for_i = 0
    for item in lst_news:
        # Approach 1: pin down the li by its id
        li_id = item.get_attribute('id')
        title = item.find_element_by_xpath('//li[@id="' + li_id + '"]/div/h2/span[@class="title"]').text
        print(title)
        # Approach 2: pin down the li by its 1-based position
        for_i += 1
        title2 = item.find_element_by_xpath('//li[' + str(for_i) + ']/div/h2/span[@class="title"]').text
        print(title2)
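==Cause of the problem: the internal implementation is shown below. Note that the return hands the command to self._parent, which is the WebDriver instance the element was found from; the element's id is only passed along in params. An XPath that starts with // is evaluated from the document root regardless of which element the method is called on, so without a distinguishing condition on li the lookup always returns the first match on the page. That is why, inside the for loop, we have to specify which li we mean.==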
    # Private Methods
    def _execute(self, command, params=None):
        """Executes a command against the underlying HTML element.

        Args:
          command: The name of the command to _execute as a string.
          params: A dictionary of named parameters to send with the command.

        Returns:
          The command's JSON response loaded into a dictionary object.
        """
        if not params:
            params = {}
        params['id'] = self._id
        return self._parent.execute(command, params)
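An alternative to rebuilding the li selector on every iteration is a relative XPath: a leading ./ makes the expression start from the element the method is called on. The sketch below assumes the same page structure as the loop earlier; collect_titles, LIST_XPATH and TITLE_XPATH are hypothetical names, and it uses the generic find_elements / find_element(by, value) methods with the locator strategy string "xpath" instead of the find_element_by_xpath helpers.

```python
# Relative XPath: the leading "./" anchors the search at the current <li>.
TITLE_XPATH = './div/h2/span[@class="title"]'
LIST_XPATH = '//ul[@class="sameday_list"]/li'


def collect_titles(driver):
    """Return the title text of every <li> in the sameday_list <ul>.

    `driver` is assumed to be a Selenium WebDriver (or any object exposing
    the same find_elements / find_element methods).
    """
    titles = []
    for item in driver.find_elements("xpath", LIST_XPATH):
        # Evaluated relative to `item`, so each iteration reads its own title.
        titles.append(item.find_element("xpath", TITLE_XPATH).text)
    return titles
```

With a running Firefox session this would be called as collect_titles(driver) once the page has loaded; neither the id lookup nor the counter is needed.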