{"id":1005,"date":"2019-02-04T08:12:02","date_gmt":"2019-02-03T23:12:02","guid":{"rendered":"https:\/\/oboki.net\/workspace\/?p=1005"},"modified":"2020-01-05T14:41:33","modified_gmt":"2020-01-05T05:41:33","slug":"scroll-api","status":"publish","type":"post","link":"https:\/\/oboki.net\/workspace\/data-engineering\/elasticsearch\/scroll-api\/","title":{"rendered":"[ElasticSearch] Scroll API"},"content":{"rendered":"<h1>[ElasticSearch] Scroll API<\/h1>\n<p>In Elasticsearch, the <code>search context<\/code> of a basic search API request is discarded as soon as one page of results has been returned. Pagination built on the search API with the <code>from<\/code> and <code>size<\/code> values runs a separate query for every page; in RDBMS terms, it is as if the cursor were closed after each fetch.<\/p>\n<blockquote>\n<p>Retrieving a large amount of data requires a different approach, and in Elasticsearch the Scroll API is what provides the same functionality as an RDBMS cursor.<\/p>\n<\/blockquote>\n<h2>Usage<\/h2>\n<p>Here is a quick walk-through of the <code>scroll API<\/code> using curl.<\/p>\n<p>First, run an ordinary search query with just the scroll argument added. 
When the scroll value is present, Elasticsearch keeps the search context of that query alive for the specified duration.<\/p>\n<p>To retrieve every Issue Number value stored in the <code>issue-v0.1.3<\/code> index, send a basic <code>match_all<\/code> query with only the <code>scroll=10m<\/code> parameter added, as follows.<\/p>\n<pre><code class=\"language-bash\">curl -XPOST \\\n-H &#039;Content-Type: application\/json&#039; \\\n&#039;localhost:9200\/issue-v0.1.3\/_search?scroll=10m&#039; -d&#039;\n{\n    &quot;_source&quot;: [&quot;Issue Number&quot;],\n    &quot;size&quot;: 100,\n    &quot;query&quot;: {\n        &quot;match_all&quot;: {}\n    }\n}\n&#039;<\/code><\/pre>\n<p>The response contains the hits you would expect from the Search API together with a <code>_scroll_id<\/code> value. For the specified duration (10m here), this id refers to the search context of the query that was just issued.<\/p>\n<pre><code class=\"language-json\">{&quot;_scroll_id&quot;:&quot;DnF1ZXJ5VGhlbkZldGNoAgAAAAAAAGLvFlQ2ZldFT3NvUzhha1ZrMm5RZjlITHcAAAAAAABi8BZUNmZXRU9zb1M4YWtWazJuUWY5SEx3&quot;,&quot;took&quot;:16,&quot;timed_out&quot;:false,&quot;_shards&quot;:{&quot;total&quot;:2,&quot;successful&quot;:2,&quot;skipped&quot;:0,&quot;failed&quot;:0},&quot;hits&quot;:{&quot;total&quot;:187456,&quot;max_score&quot;:1.0,&quot;hits&quot;:[{&quot;_index&quot;:&quot;issue-v0.1.3&quot;,&quot;_type&quot;:&quot;doc&quot;,&quot;_id&quot;:&quot;131544&quot;,&quot;_score&quot;:1.0,&quot;_source&quot;:{&quot;Issue Number&quot;:131544}},{&quot;_index&quot;:&quot;issue-v0.1.3&quot;,&quot;_type&quot;:&quot;doc&quot;,&quot;_id&quot;:&quot;184522&quot;,&quot;_score&quot;:1.0,&quot;_source&quot;:{&quot;Issue Number&quot;:184522}},...,{&quot;_index&quot;:&quot;issue-v0.1.3&quot;,&quot;_type&quot;:&quot;doc&quot;,&quot;_id&quot;:&quot;8953&quot;,&quot;_score&quot;:1.0,&quot;_source&quot;:{&quot;Issue Number&quot;:8953}},{&quot;_index&quot;:&quot;issue-v0.1.3&quot;,&quot;_type&quot;:&quot;doc&quot;,&quot;_id&quot;:&quot;8955&quot;,&quot;_score&quot;:1.0,&quot;_source&quot;:{&quot;Issue Number&quot;:8955}}]}}<\/code><\/pre>\n<p>Because the query above sets <code>&quot;size&quot;: 100<\/code> to fetch 100 documents at a time, the previous response contains only 100 hits. From here on, passing only the <code>_scroll_id<\/code> value received from the first request back to Elasticsearch returns the data that has not been delivered yet, 100 documents per call.<\/p>\n<pre><code class=\"language-bash\">curl -XPOST \\\n-H &#039;Content-Type: application\/json&#039; \\\nlocalhost:9200\/_search\/scroll -d&#039;\n{\n    &quot;scroll&quot;: &quot;10m&quot;,\n    &quot;scroll_id&quot;: &quot;DnF1ZXJ5VGhlbkZldGNoAgAAAAAAAGLvFlQ2ZldFT3NvUzhha1ZrMm5RZjlITHcAAAAAAABi8BZUNmZXRU9zb1M4YWtWazJuUWY5SEx3&quot;\n}\n&#039;<\/code><\/pre>\n<p>By default, a search context is discarded once the specified time has elapsed, but the full fetch may finish sooner than expected, so the context can also be removed explicitly as follows. 
A search context occupies memory on the Elasticsearch server for as long as it is kept alive, so removing it explicitly is good practice.<\/p>\n<pre><code class=\"language-bash\">curl -XDELETE \\\n-H &#039;Content-Type: application\/json&#039; \\\nlocalhost:9200\/_search\/scroll -d&#039;\n{\n    &quot;scroll_id&quot;: &quot;DnF1ZXJ5VGhlbkZldGNoAgAAAAAAAGLvFlQ2ZldFT3NvUzhha1ZrMm5RZjlITHcAAAAAAABi8BZUNmZXRU9zb1M4YWtWazJuUWY5SEx3&quot;\n}\n&#039;<\/code><\/pre>\n<h2>Python Sample<\/h2>\n<p>Add the scroll attribute to the initial search API call, then use the Elasticsearch.scroll method to retrieve the remaining results. The flow is the same as with curl above.<\/p>\n<pre><code class=\"language-python\">from elasticsearch import Elasticsearch\n\n# Keep-alive window that is renewed on every request\n_KEEP_ALIVE_LIMIT = &#039;30s&#039;\nbody = {\n  &quot;_source&quot;: [&quot;Issue Number&quot;],\n  &quot;query&quot;: {\n    &quot;match_all&quot;: {}\n  }\n}\n\nes_client = Elasticsearch([&quot;localhost:9200&quot;], timeout=300)\nresponse = es_client.search(\n  index = &#039;issue-v0.1.3&#039;,\n  doc_type = &#039;doc&#039;,\n  scroll = _KEEP_ALIVE_LIMIT,\n  size = 100,\n  body = body\n  )\n\nsid = response[&#039;_scroll_id&#039;]\nfetched = len(response[&#039;hits&#039;][&#039;hits&#039;])\n\nnums = []\nfor i in range(fetched):\n  nums.append(int(response[&#039;hits&#039;][&#039;hits&#039;][i][&#039;_source&#039;][&#039;Issue Number&#039;]))\n\nwhile(fetched&gt;0):\n  response = es_client.scroll(scroll_id=sid, scroll=_KEEP_ALIVE_LIMIT)\n  # The scroll id can change between calls, so always reuse the latest one\n  sid = response[&#039;_scroll_id&#039;]\n  fetched = len(response[&#039;hits&#039;][&#039;hits&#039;])\n  for i in range(fetched):\n    nums.append(int(response[&#039;hits&#039;][&#039;hits&#039;][i][&#039;_source&#039;][&#039;Issue Number&#039;]))\n\n# Release the search context explicitly instead of waiting for the timeout\nes_client.clear_scroll(scroll_id=sid)\n\nprint(nums)<\/code><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>[ElasticSearch] Scroll API In Elasticsearch, the search context of a basic search API request is discarded as soon as one page of results has been returned. Pagination built on the search API with the from and size values runs a separate query for every page; in RDBMS terms, it is as if the cursor were closed after each fetch. Retrieving a large amount of data requires a different approach, and in Elasticsearch the Scroll API is what provides the same functionality as an RDBMS cursor. Usage Here is a quick walk-through of the scroll [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[12],"tags":[25,105],"class_list":["post-1005","post","type-post","status-publish","format-standard","hentry","category-elasticsearch","tag-elasticsearch","tag-scroll"],"_links":{"self":[{"href":"https:\/\/oboki.net\/workspace\/wp-json\/wp\/v2\/posts\/1005","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/oboki.net\/workspace\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/oboki.net\/workspace\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/oboki.net\/workspace\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/oboki.net\/workspace\/wp-json\/wp\/v2\/comments?post=1005"}],"version-history":[{"count":8,"href":"https:\/\/oboki.net\/workspace\/wp-json\/wp\/v2\/posts\/1005\/revisions"}],"predecessor-version":[{"id":1476,"href":"https:\/\/obok
i.net\/workspace\/wp-json\/wp\/v2\/posts\/1005\/revisions\/1476"}],"wp:attachment":[{"href":"https:\/\/oboki.net\/workspace\/wp-json\/wp\/v2\/media?parent=1005"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/oboki.net\/workspace\/wp-json\/wp\/v2\/categories?post=1005"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/oboki.net\/workspace\/wp-json\/wp\/v2\/tags?post=1005"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}