2 | 2 | "cells": [ |
3 | 3 | { |
4 | 4 | "cell_type": "markdown", |
5 | | - "id": "liquid-finland", |
| 5 | + "id": "sticky-exhibit", |
6 | 6 | "metadata": {}, |
7 | 7 | "source": [ |
8 | | - "# demo\n", |
| 8 | + "# Demo\n", |
9 | 9 | "\n", |
10 | 10 | "Author: Cindy Chiao\n", |
11 | 11 | "Last Modified: Nov 16, 2021\n", |
|
21 | 21 | { |
22 | 22 | "cell_type": "code", |
23 | 23 | "execution_count": 1, |
24 | | - "id": "guided-johnston", |
| 24 | + "id": "banner-importance", |
25 | 25 | "metadata": {}, |
26 | 26 | "outputs": [], |
27 | 27 | "source": [ |
|
32 | 32 | }, |
33 | 33 | { |
34 | 34 | "cell_type": "markdown", |
35 | | - "id": "incorporated-november", |
| 35 | + "id": "equipped-sense", |
36 | 36 | "metadata": {}, |
37 | 37 | "source": [ |
38 | 38 | "## Example data\n", |
|
43 | 43 | { |
44 | 44 | "cell_type": "code", |
45 | 45 | "execution_count": 2, |
46 | | - "id": "substantial-puzzle", |
| 46 | + "id": "dutch-grave", |
47 | 47 | "metadata": {}, |
48 | 48 | "outputs": [ |
49 | 49 | { |
|
566 | 566 | { |
567 | 567 | "cell_type": "code", |
568 | 568 | "execution_count": 3, |
569 | | - "id": "imported-circus", |
| 569 | + "id": "applicable-diesel", |
570 | 570 | "metadata": {}, |
571 | 571 | "outputs": [ |
572 | 572 | { |
|
599 | 599 | }, |
600 | 600 | { |
601 | 601 | "cell_type": "markdown", |
602 | | - "id": "portuguese-decline", |
| 602 | + "id": "animated-marsh", |
603 | 603 | "metadata": {}, |
604 | 604 | "source": [ |
605 | 605 | "## Batch generation\n", |
|
614 | 614 | { |
615 | 615 | "cell_type": "code", |
616 | 616 | "execution_count": 4, |
617 | | - "id": "rocky-dealer", |
| 617 | + "id": "attempted-cooling", |
618 | 618 | "metadata": {}, |
619 | 619 | "outputs": [ |
620 | 620 | { |
|
1040 | 1040 | }, |
1041 | 1041 | { |
1042 | 1042 | "cell_type": "markdown", |
1043 | | - "id": "smoking-acrobat", |
| 1043 | + "id": "digital-night", |
1044 | 1044 | "metadata": {}, |
1045 | 1045 | "source": [ |
1046 | 1046 | "We can verify that the outputs have the expected shapes. \n", |
|
1051 | 1051 | { |
1052 | 1052 | "cell_type": "code", |
1053 | 1053 | "execution_count": 5, |
1054 | | - "id": "looking-journalism", |
| 1054 | + "id": "integral-theta", |
1055 | 1055 | "metadata": {}, |
1056 | 1056 | "outputs": [ |
1057 | 1057 | { |
|
1069 | 1069 | }, |
1070 | 1070 | { |
1071 | 1071 | "cell_type": "markdown", |
1072 | | - "id": "cellular-designer", |
| 1072 | + "id": "usual-kennedy", |
1073 | 1073 | "metadata": {}, |
1074 | 1074 | "source": [ |
1075 | 1075 | "There are 145 lat points and 192 lon points, so we expect 145 * 192 = 27840 samples in a batch." |
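
The expected sample count above can be checked with a few lines of plain Python. This is a minimal sketch of the arithmetic only: the helper name `samples_per_batch` is hypothetical and is not part of the library shown in the notebook. It assumes every listed dimension is flattened into a single sample axis.

```python
from math import prod


def samples_per_batch(dim_sizes):
    """Return the number of samples obtained when every dimension
    in `dim_sizes` is flattened into one sample axis.

    Hypothetical helper for illustration; not a library function.
    """
    return prod(dim_sizes.values())


# Dimension sizes quoted in the notebook text.
n_samples = samples_per_batch({"lat": 145, "lon": 192})
print(n_samples)
```

Comparing this number with the leading dimension of a generated batch is a quick sanity check that no grid points were dropped.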
|
1078 | 1078 | { |
1079 | 1079 | "cell_type": "code", |
1080 | 1080 | "execution_count": 6, |
1081 | | - "id": "accurate-arthur", |
| 1081 | + "id": "incomplete-native", |
1082 | 1082 | "metadata": {}, |
1083 | 1083 | "outputs": [ |
1084 | 1084 | { |
|
1096 | 1096 | }, |
1097 | 1097 | { |
1098 | 1098 | "cell_type": "markdown", |
1099 | | - "id": "fewer-transfer", |
| 1099 | + "id": "durable-gazette", |
1100 | 1100 | "metadata": {}, |
1101 | 1101 | "source": [ |
1102 | 1102 | "## Controlling the size/shape of batches\n", |
|
1107 | 1107 | { |
1108 | 1108 | "cell_type": "code", |
1109 | 1109 | "execution_count": 7, |
1110 | | - "id": "charming-drive", |
| 1110 | + "id": "sophisticated-legislation", |
1111 | 1111 | "metadata": {}, |
1112 | 1112 | "outputs": [ |
1113 | 1113 | { |
|
1559 | 1559 | }, |
1560 | 1560 | { |
1561 | 1561 | "cell_type": "markdown", |
1562 | | - "id": "specialized-realtor", |
| 1562 | + "id": "spectacular-reading", |
1563 | 1563 | "metadata": {}, |
1564 | 1564 | "source": [ |
1565 | 1565 | "## Last batch behavior\n", |
|
1570 | 1570 | { |
1571 | 1571 | "cell_type": "code", |
1572 | 1572 | "execution_count": 8, |
1573 | | - "id": "broadband-solid", |
| 1573 | + "id": "residential-income", |
1574 | 1574 | "metadata": {}, |
1575 | 1575 | "outputs": [ |
1576 | 1576 | { |
|
2005 | 2005 | }, |
2006 | 2006 | { |
2007 | 2007 | "cell_type": "markdown", |
2008 | | - "id": "boring-slide", |
| 2008 | + "id": "competitive-islam", |
2009 | 2009 | "metadata": {}, |
2010 | 2010 | "source": [ |
2011 | 2011 | "## Overlapping inputs\n", |
|
2017 | 2017 | { |
2018 | 2018 | "cell_type": "code", |
2019 | 2019 | "execution_count": 9, |
2020 | | - "id": "fossil-wonder", |
| 2020 | + "id": "cleared-custody", |
2021 | 2021 | "metadata": {}, |
2022 | 2022 | "outputs": [ |
2023 | 2023 | { |
|
2473 | 2473 | }, |
2474 | 2474 | { |
2475 | 2475 | "cell_type": "markdown", |
2476 | | - "id": "direct-mason", |
| 2476 | + "id": "harmful-benefit", |
2477 | 2477 | "metadata": {}, |
2478 | 2478 | "source": [ |
2479 | 2479 | "We can inspect the samples in a batch for a single lat/lon pixel. Note that the overlap applies only within a batch, not across batches. Thus, from the 20 time points in a batch, we get 11 samples of 10 time points each, with consecutive samples overlapping by 9 time points." |
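
The count of 11 samples follows from sliding-window arithmetic: a window of 10 points with 9 points of overlap advances by 1 point each step. The sketch below illustrates that reasoning in plain Python. The function `window_starts` is a hypothetical name used for illustration, and it is not claimed to mirror how the library computes its windows internally.

```python
def window_starts(n_points, window, overlap):
    """Return the start index of each window of length `window`
    taken from `n_points` points, where consecutive windows share
    `overlap` points.

    Hypothetical helper for illustration; not a library function.
    """
    stride = window - overlap  # how far each window advances
    if stride <= 0:
        raise ValueError("overlap must be smaller than the window length")
    # The last valid start is n_points - window (inclusive).
    return list(range(0, n_points - window + 1, stride))


# Values quoted in the notebook text: 20 time points per batch,
# samples of 10 time points, 9 time points of overlap.
starts = window_starts(n_points=20, window=10, overlap=9)
print(len(starts), starts)
```

Because windows never reach past the end of a batch, the last sample starts at index 10, which is why overlap does not carry across batch boundaries.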
|
2482 | 2482 | { |
2483 | 2483 | "cell_type": "code", |
2484 | 2484 | "execution_count": 10, |
2485 | | - "id": "instructional-criticism", |
| 2485 | + "id": "earlier-warehouse", |
2486 | 2486 | "metadata": {}, |
2487 | 2487 | "outputs": [ |
2488 | 2488 | { |
|
2944 | 2944 | }, |
2945 | 2945 | { |
2946 | 2946 | "cell_type": "markdown", |
2947 | | - "id": "ranging-cologne", |
| 2947 | + "id": "arranged-telephone", |
2948 | 2948 | "metadata": {}, |
2949 | 2949 | "source": [ |
2950 | 2950 | "## Example applications\n", |
|
2957 | 2957 | { |
2958 | 2958 | "cell_type": "code", |
2959 | 2959 | "execution_count": 11, |
2960 | | - "id": "premier-syria", |
| 2960 | + "id": "consolidated-chocolate", |
2961 | 2961 | "metadata": {}, |
2962 | 2962 | "outputs": [ |
2963 | 2963 | { |
|
3005 | 3005 | }, |
3006 | 3006 | { |
3007 | 3007 | "cell_type": "markdown", |
3008 | | - "id": "raised-breakfast", |
| 3008 | + "id": "legislative-closer", |
3009 | 3009 | "metadata": {}, |
3010 | 3010 | "source": [ |
3011 | 3011 | "We can also use Xarray's `stack` method to transform these into 2D inputs (n_samples, n_features) suitable for other machine learning algorithms implemented in libraries such as [sklearn](https://scikit-learn.org/stable/) and [xgboost](https://xgboost.readthedocs.io/en/stable/). In this case, we expect 9 x 9 x 9 = 729 features in total." |
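
The shape transformation described above can be illustrated without the original dataset. The sketch below uses a NumPy `reshape` as a stand-in for Xarray's `stack`, which is a deliberate substitution to keep the example self-contained. The array contents and the sample count of 4 are made up for illustration, and only the 9 x 9 x 9 feature layout comes from the notebook text.

```python
import numpy as np

# Assumed sizes: 4 samples is arbitrary; 9 x 9 x 9 matches the text.
n_samples, n_time, n_lat, n_lon = 4, 9, 9, 9

# Synthetic batch with dimensions (sample, time, lat, lon).
batch = np.arange(
    n_samples * n_time * n_lat * n_lon, dtype=float
).reshape(n_samples, n_time, n_lat, n_lon)

# Collapse every non-sample dimension into a single feature axis,
# giving the 2D (n_samples, n_features) layout.
features = batch.reshape(n_samples, -1)
print(features.shape)
```

Each row of `features` holds one sample's 729 values in time-major order, which is the layout most tabular estimators expect.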
|
3014 | 3014 | { |
3015 | 3015 | "cell_type": "code", |
3016 | 3016 | "execution_count": 12, |
3017 | | - "id": "protecting-aside", |
| 3017 | + "id": "advisory-chicken", |
3018 | 3018 | "metadata": {}, |
3019 | 3019 | "outputs": [ |
3020 | 3020 | { |
|
3055 | 3055 | }, |
3056 | 3056 | { |
3057 | 3057 | "cell_type": "markdown", |
3058 | | - "id": "vocal-roots", |
| 3058 | + "id": "persistent-culture", |
3059 | 3059 | "metadata": {}, |
3060 | 3060 | "source": [ |
3061 | 3061 | "## What's next?\n", |
|