(14) Exploring Your Data

Date: 2019-12-05

Sample Dataset

Now that we’ve gotten a glimpse of the basics, let’s try to work on a more realistic dataset. I’ve prepared a sample of fictitious JSON documents of customer bank account information. Each document has the following schema:

{
    "account_number": 0,
    "balance": 16623,
    "firstname": "Bradshaw",
    "lastname": "Mckenzie",
    "age": 29,
    "gender": "F",
    "address": "244 Columbus Place",
    "employer": "Euron",
    "email": "bradshawmckenzie@euron.com",
    "city": "Hobucken",
    "state": "CO"
}

For the curious, this data was generated using www.json-generator.com/, so please ignore the actual values and semantics of the data, as these are all randomly generated.

Loading the Sample Dataset

You can download the sample dataset (accounts.json) from here. Extract it to our current directory and let’s load it into our cluster as follows:

curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_doc/_bulk?pretty&refresh" --data-binary "@accounts.json"
curl "localhost:9200/_cat/indices?v"

And the response:

health status index uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   bank  l7sSYV2cQXmu6_4rJWVIww   5   1       1000            0    128.6kb        128.6kb

Which means that we just successfully bulk indexed 1000 documents into the bank index (under the _doc type).
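
The bulk endpoint expects newline-delimited JSON in which each document is preceded by an action line, so a file like accounts.json carries two lines per document. The sketch below is only an illustration of that layout: the file name sample_accounts.json and every field value in it are made up for this example and are not part of the real dataset.

```shell
# Write a two-document file in the bulk (newline-delimited JSON) layout.
# sample_accounts.json is a hypothetical name; all values are invented.
cat > sample_accounts.json <<'EOF'
{"index":{"_id":"1"}}
{"account_number":1,"balance":100,"firstname":"Ada","lastname":"Smith","age":30,"gender":"F","address":"1 Example Street","employer":"Acme","email":"ada@example.com","city":"Springfield","state":"CO"}
{"index":{"_id":"2"}}
{"account_number":2,"balance":200,"firstname":"Bob","lastname":"Jones","age":41,"gender":"M","address":"2 Example Street","employer":"Acme","email":"bob@example.com","city":"Springfield","state":"CO"}
EOF

# Each document takes two lines (action line + source line),
# so the document count is half the line count.
lines=$(wc -l < sample_accounts.json)
docs=$((lines / 2))
echo "$docs"
```

Assuming a cluster is listening on localhost:9200, such a file could be loaded with the same curl command shown earlier by pointing --data-binary at "@sample_accounts.json". Note that the last line of a bulk file must end with a newline, which the heredoc above provides.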
