ESG Octopus Development Notes
What's here:
README
This file. It is in git.
esg_octopus.py
The base script that implements the octopus beak. To create the
"esg_octopus" symlink, run "./esg_octopus.py --makelink".
esg_octopus.py is in git. The symlink (esg_octopus) is not.
esg_octopus.cfg
The configuration file for esg_octopus. Not in git.
esg_octopus.cfg.example
An example configuration file that can be copied to
esg_octopus.cfg. This is in git.
data/
A copy of the THREDDS catalog from ORNL's production data node
esg2-sdn1. This is not checked into git.
java/
The .jar files needed for plugins and plugouts. These are not in
git. The setup/install script (not yet written) will download
these at install time.
plugins/
Where input format plugins reside. Each plugin has the structure
described later in this file. The netcdf example plugin is in git.
plugouts/
Where output format plugins reside. Each plugout has the
structure described later in this file. The thredds example
plugout is in git.
What Needs to be Done
- Finish characterizing the contents of our current THREDDS catalog
for guidance on what needs to go into THREDDS catalogs used in ESG.
This will likely need to be broadened in the future. We have to
start somewhere.
- Figure out how to build a catalog in memory using the
ThreddsXmlWriter and related classes.
- Identify the calls the thredds plugout will make to gather the
information it needs to construct the catalog. The thredds plugout
will call up to the beak. The beak will forward these calls to the
netcdf plugin.
- Write the routines in the beak layer that will receive the calls
from the thredds plugout.
- Work out how to incorporate the cdms2 python module into
esg_octopus. Is this another piece the install script has to
download at install time? This will be needed to complete the
netcdf routines.
- Write the routines in the netcdf plugin that will receive the calls
forwarded by the beak.
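
The call path described in the list above (plugout calls up to the beak,
beak forwards to the plugin) could be sketched roughly as follows. This is
only an illustration written for Python 3: the class names, the request()
method, and the list_variables call are all hypothetical, since the real
set of calls has not been identified yet.

```python
class Beak:
    """Hypothetical beak layer: relays plugout requests to the input plugin."""

    def __init__(self, plugin):
        self.plugin = plugin

    def request(self, name, *args, **kwargs):
        # Look the call up on the plugin by name and forward it unchanged.
        handler = getattr(self.plugin, name, None)
        if not callable(handler):
            raise NotImplementedError("plugin does not provide %r" % name)
        return handler(*args, **kwargs)


class FakeNetcdfPlugin:
    """Stand-in for the netcdf plugin; returns canned data."""

    def list_variables(self, path):
        # A real plugin would open the file (for example with cdms2).
        return ["tas", "ps"]


class FakeThreddsPlugout:
    """Stand-in for the thredds plugout; it only talks to the beak."""

    def __init__(self, beak):
        self.beak = beak

    def describe(self, path):
        variables = self.beak.request("list_variables", path)
        return {"path": path, "variables": variables}


beak = Beak(FakeNetcdfPlugin())
plugout = FakeThreddsPlugout(beak)
description = plugout.describe("example.nc")
```

Routing everything through one request() method keeps the plugout unaware
of which input plugin is in use, which is the point of having the beak in
the middle.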
Plugin/Plugout Layout
Each plugin or plugout has its own subdirectory in the respective
PluginDir or PlugoutDir, as defined in the configuration file.
At a minimum, the plug{in,out} directory contains a file named
__init__.py, which is executed when the plug{in,out} is imported.
The subdirectory name is the string the user will enter on the
command line as the argument for options --input_format or
--output_format to use the named plug{in,out}.
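
One way the beak might locate and import a plug{in,out} by its
subdirectory name is sketched below for Python 3. The function name
load_plug and the "esg_plug_" module prefix are assumptions made for this
illustration; base_dir stands for whatever PluginDir or PlugoutDir holds
in the configuration file.

```python
import importlib.util
import os
import sys


def load_plug(base_dir, name):
    """Import <base_dir>/<name>/__init__.py and return the module.

    base_dir is the PluginDir or PlugoutDir value; name is the string
    given to --input_format or --output_format.
    """
    package_dir = os.path.join(base_dir, name)
    init_path = os.path.join(package_dir, "__init__.py")
    if not os.path.isfile(init_path):
        raise ImportError("no plug{in,out} named %r in %s" % (name, base_dir))

    # Prefix the module name (hypothetical convention) so a plugin called
    # "netcdf" cannot shadow an unrelated installed module of that name.
    module_name = "esg_plug_" + name
    spec = importlib.util.spec_from_file_location(
        module_name, init_path, submodule_search_locations=[package_dir]
    )
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module
    spec.loader.exec_module(module)  # runs __init__.py
    return module
```

A call such as load_plug(plugin_dir, "netcdf") would then return the
module object for the netcdf plugin, or raise ImportError if the
subdirectory or its __init__.py is missing.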
The current publication sequence:
esgscan_directory --dataset <dataset-name> --project <project-name> \
<pathtofiles> > <mapfile>
# <mapfile> contains "dataset map" where each line contains:
dataset_id | absolute_file_path | size
# <dataset-name> and <project-name> have to be defined in esg.ini
esgpublish --map <mapfile> --project <project-name>
esgpublish --use-existing <dataset-name> --noscan --thredds
esgpublish --use-existing <dataset-name> --noscan --publish
(requires globus certificate)
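
For reference, a dataset map in the three-field form shown above could be
read with something like the following Python 3 sketch. The function
names are hypothetical and the sample line is made up for illustration.
Any fields beyond the first three are ignored here, and the size is
assumed to be an integer byte count.

```python
def parse_mapfile_line(line):
    """Split one 'dataset_id | absolute_file_path | size' line."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) < 3:
        raise ValueError("expected at least 3 fields: %r" % line)
    dataset_id, path, size = fields[:3]
    return dataset_id, path, int(size)


def parse_mapfile(lines):
    """Return a list of (dataset_id, path, size), skipping blank lines."""
    return [parse_mapfile_line(line) for line in lines if line.strip()]


# Made-up sample line, not taken from a real mapfile.
sample = ["ornl.arm.data.mytest.v1 | /data/mytest/file.nc | 1024\n", "\n"]
entries = parse_mapfile(sample)
```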
Current THREDDS Catalog Characterization
Looking at THREDDS xml files copied from esg2-sdn1.
All files begin with the line
<?xml version='1.0' encoding='UTF-8'?>
This is followed in each file by a single <catalog> ... </catalog>
element which occupies the rest of the file. It appears that the
<catalog ...> tag always contains the same attributes (xmlns:xsi,
xmlns:xlink, xmlns, name, xsi:schemaLocation). Only the value of
the name attribute varies. For the catalog.xml file at the top
level, name is "ORNL Earth System Grid catalog". For files one
layer down, it's always "TDS configuration file".
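
A check like the one described above can be scripted with the standard
library, as in this Python 3 sketch. The helper names and the sample
document are made up for illustration. Note that ElementTree consumes the
xmlns declarations while parsing, so they do not show up in the
attribute dictionary; only name (and xsi:schemaLocation, in
namespace-qualified form) would.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def catalog_name(xml_bytes):
    """Return the name attribute of the top-level <catalog> element."""
    root = ET.fromstring(xml_bytes)
    if local_name(root.tag) != "catalog":
        raise ValueError("top-level element is %r" % root.tag)
    return root.get("name")


# Made-up minimal document in the shape described above.
SAMPLE = b"""<?xml version='1.0' encoding='UTF-8'?>
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         name="TDS configuration file">
</catalog>"""

name = catalog_name(SAMPLE)
```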
catalog.xml: Inside the <catalog> ... </catalog> element appear:
- datasetRoot
> path
> location
- catalogRef (repeated 46 times)
> name
> xlink:title
> xlink:href
The following files are present in the catalog directory but are not
referenced from catalog.xml:
ornl.arm.data.mytest.v1.xml
ornl.arm.data.rgmtest.v1.xml
ornl.ccsm.b30.041g.proc.tseries.monthly.v1.xml
ornl.cdiac.ameriflux_L2.AMF_USARb.v1.xml
ornl.cdiac.ameriflux_L2.AMF_USARc.v1.xml
ornl.cdiac.ameriflux_L2.AMF_USARM.v1.xml
ornl.cdiac.ameriflux_L2.AMF_USAud.v1.xml
ornl.cdiac.ameriflux_L2.AMF_USSP1.v1.xml
pcmdi.ornl.test.mytest.v1.xml
pcmdi.test.mytest.v1.xml
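
A comparison like the one that produced the list above could be automated
along these lines (Python 3). This is a sketch under assumptions: the
function names are hypothetical, and it assumes each catalogRef's
xlink:href ends in the file name of the lower-level catalog, so only the
last path component is compared.

```python
import posixpath
import xml.etree.ElementTree as ET

XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def referenced_files(catalog_xml):
    """Return the set of file names referenced by catalogRef elements."""
    root = ET.fromstring(catalog_xml)
    names = set()
    for elem in root.iter():
        if elem.tag.rsplit("}", 1)[-1] == "catalogRef":
            href = elem.get(XLINK_HREF)
            if href:
                names.add(posixpath.basename(href))
    return names


def unreferenced(catalog_xml, filenames):
    """Return the sorted file names that catalog.xml does not reference."""
    missing = set(filenames) - referenced_files(catalog_xml)
    missing.discard("catalog.xml")
    return sorted(missing)


# Made-up top-level catalog with a single catalogRef.
SAMPLE = b"""<?xml version='1.0' encoding='UTF-8'?>
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         name="ORNL Earth System Grid catalog">
  <catalogRef name="a" xlink:title="a" xlink:href="1/a.v1.xml"/>
</catalog>"""

missing = unreferenced(SAMPLE, ["catalog.xml", "a.v1.xml", "b.v1.xml"])
```

In practice the filenames argument would come from a directory listing of
the copied catalog, for example via os.listdir().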
In the lower level files, <catalog> ... </catalog> contains:
- service
> OpenDAP
> HTTPServer
> SRM
- property
> catalog_version
- dataset (may be nested)
> property (multiple)
> metadata
> variables
> variable name="..."
> metadata inherited="true"
> nested datasets
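
To help finish the characterization, a lower-level catalog file can be
summarized with a short script such as this Python 3 sketch. The function
name, the returned keys, and the sample document are all made up for
illustration; the sample follows the outline above but is not copied from
a real catalog.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]


def _dataset_depth(elem):
    """Depth of dataset nesting below (and including) this dataset."""
    child_depths = [_dataset_depth(c) for c in elem if _local(c.tag) == "dataset"]
    return 1 + max(child_depths, default=0)


def summarize(catalog_xml):
    """Count datasets, measure nesting depth, and collect variable names."""
    root = ET.fromstring(catalog_xml)
    top_level = [c for c in root if _local(c.tag) == "dataset"]
    return {
        "datasets": sum(1 for e in root.iter() if _local(e.tag) == "dataset"),
        "max_depth": max((_dataset_depth(d) for d in top_level), default=0),
        "variables": [
            e.get("name") for e in root.iter() if _local(e.tag) == "variable"
        ],
    }


# Made-up document shaped like the outline above.
SAMPLE = b"""<?xml version='1.0' encoding='UTF-8'?>
<catalog xmlns="http://www.unidata.ucar.edu/namespaces/thredds/InvCatalog/v1.0"
         name="TDS configuration file">
  <property name="catalog_version" value="2"/>
  <dataset name="outer">
    <property name="dataset_id" value="example"/>
    <variables>
      <variable name="tas"/>
      <variable name="ps"/>
    </variables>
    <dataset name="inner"/>
  </dataset>
</catalog>"""

summary = summarize(SAMPLE)
```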
In the thredds catalog, we have the following xml files. Under each
.xml file are listed the netcdf files it represents.
obs4cmip5.observations.cmbe.ARM.Barrow.in-situ_stations.atmos.1hr.v1.xml
PRJ/obs4cmip5/observations/atmos/ps/1hr/in_situ_sites/usdoe/arm/cmbe/
v120110314/ps_arm_barrow_cmbe_v1p1_20010101-20101231.nc
PRJ/obs4cmip5/observations/atmos/sfc/1hr/in_situ_sites/usdoe/arm/cmbe/
v120110314/sfcWind_arm_barrow_cmbe_v1p1_20010101-20101231.nc
PRJ/obs4cmip5/observations/atmos/tas/1hr/in_situ_sites/usdoe/arm/cmbe/
v120110314/tas_arm_barrow_cmbe_v1p1_20010101-20101231.nc
PRJ/obs4cmip5/observations/atmos/uas/1hr/in_situ_sites/usdoe/arm/cmbe/
v120110314/uas_arm_barrow_cmbe_v1p1_20010101-20101231.nc
PRJ/obs4cmip5/observations/atmos/vas/1hr/in_situ_sites/usdoe/arm/cmbe/
v120110314/vas_arm_barrow_cmbe_v1p1_20010101-20101231.nc
obs4cmip5.observations.cmbe.ARM.Darwin.in-situ_stations.atmos.1hr.v1.xml
PRJ/obs4cmip5/observations/atmos/ps/1hr/in_situ_sites/usdoe/arm/cmbe/
v120110314/ps_arm_darwin_cmbe_v1p1_20020101-20101231.nc
PRJ/obs4cmip5/observations/atmos/sfc/1hr/in_situ_sites/usdoe/arm/cmbe/
v120110314/sfcWind_arm_darwin_cmbe_v1p1_20020101-20101231.nc
PRJ/obs4cmip5/observations/atmos/tas/1hr/in_situ_sites/usdoe/arm/cmbe/
v120110314/tas_arm_darwin_cmbe_v1p1_20020101-20101231.nc
PRJ/obs4cmip5/observations/atmos/uas/1hr/in_situ_sites/usdoe/arm/cmbe/
v120110314/uas_arm_darwin_cmbe_v1p1_20020101-20101231.nc
PRJ/obs4cmip5/observations/atmos/vas/1hr/in_situ_sites/usdoe/arm/cmbe/
v120110314/vas_arm_darwin_cmbe_v1p1_20020101-20101231.nc
obs4cmip5.observations.cmbe.ARM.Manus.in-situ_stations.atmos.1hr.v1.xml
PRJ/obs4cmip5/observations/atmos/ps/1hr/in_situ_sites/usdoe/arm/cmbe/
v120110314/ps_arm_manus_cmbe_v1p1_19980101-20101231.nc
PRJ/obs4cmip5/observations/atmos/sfc/1hr/in_situ_sites/usdoe/arm/cmbe/
v120110314/sfcWind_arm_manus_cmbe_v1p1_19980101-20101231.nc
PRJ/obs4cmip5/observations/atmos/tas/1hr/in_situ_sites/usdoe/arm/cmbe/
v120110314/tas_arm_manus_cmbe_v1p1_19980101-20101231.nc
PRJ/obs4cmip5/observations/atmos/uas/1hr/in_situ_sites/usdoe/arm/cmbe/
v120110314/uas_arm_manus_cmbe_v1p1_19980101-20101231.nc
PRJ/obs4cmip5/observations/atmos/vas/1hr/in_situ_sites/usdoe/arm/cmbe/
v120110314/vas_arm_manus_cmbe_v1p1_19980101-20101231.nc
obs4cmip5.observations.cmbe.ARM.Nauru.in-situ_stations.atmos.1hr.v1.xml
ornl.arm.cmbe-atm.nsac1.v1.xml
ornl.arm.cmbe-atm.sgpc1.v1.xml
ornl.arm.cmbe-atm.twpc1.v1.xml
ornl.arm.cmbe-atm.twpc2.v1.xml
ornl.arm.cmbe-atm.twpc3.v1.xml
ornl.arm.cmbe-cldrad.nsac1.v1.xml
ornl.arm.cmbe-cldrad.sgpc1.v1.xml
ornl.arm.cmbe-cldrad.twpc1.v1.xml
ornl.arm.cmbe-cldrad.twpc2.v1.xml
ornl.arm.cmbe-cldrad.twpc3.v1.xml
ornl.arm.data.mytest.v1.xml
ornl.arm.data.offline.v1.xml
ornl.arm.data.rgmtest.v1.xml
ornl.ccsm.b30.041g.proc.tseries.monthly.v1.xml
ornl.cdiac.ameriflux_L2.AMF_USARb.v1.xml
ornl.cdiac.ameriflux_L2.AMF_USARc.v1.xml
ornl.cdiac.ameriflux_L2.AMF_USARM.v1.xml
ornl.cdiac.ameriflux_L2.AMF_USAud.v1.xml
ornl.cdiac.ameriflux_L2.AMF_USSP1.v1.xml
ornl.c-lamp.exp1.clm3_casa.exp1_2.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1.clm3_casa.exp1_3.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1.clm3_casa.exp1_4.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1.clm3_casa.exp1_6.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1.clm3_casa.exp1_7.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1.clm3_cn.exp1_2.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1.clm3_cn.exp1_3.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1.clm3_cn.exp1_4.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1.clm3_cn.exp1_6.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1.clm3_cn.exp1_7.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1_fix.clm3_casa.exp1_2.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1_fix.clm3_casa.exp1_3.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1_fix.clm3_casa.exp1_4.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1_fix.clm3_casa.exp1_6.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1_fix.clm3_casa.exp1_7.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1_fix.clm3_cn.exp1_2.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1_fix.clm3_cn.exp1_3.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1_fix.clm3_cn.exp1_4.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1_fix.clm3_cn.exp1_6.run1.monthly_mean.v1.xml
ornl.c-lamp.exp1_fix.clm3_cn.exp1_7.run1.monthly_mean.v1.xml
ornl.c-lamp.exp2.clm3_casa.exp2_2.run1.monthly_mean.v1.xml
ornl.c-lamp.exp2.clm3_casa.exp2_3.run1.monthly_mean.v1.xml
ornl.c-lamp.exp2.clm3_casa.exp2_4.run1.monthly_mean.v1.xml
ornl.c-lamp.exp2.clm3_casa.exp2_6.run1.monthly_mean.v1.xml
ornl.c-lamp.exp2.clm3_cn.exp2_2.run1.monthly_mean.v1.xml
ornl.c-lamp.exp2.clm3_cn.exp2_3.run1.monthly_mean.v1.xml
ornl.c-lamp.exp2.clm3_cn.exp2_4.run1.monthly_mean.v1.xml
ornl.c-lamp.exp2.clm3_cn.exp2_6.run1.monthly_mean.v1.xml
ornl.ultrahighres.CESM1.T85F09_12YRS.F1850.v1.xml
ornl.ultrahighres.CESM1.T85F09_5YRS.B1850.v1.xml
ornl.ultrahighres.CESM1.T85F09.B1850.v1.xml
pcmdi.ornl.test.mytest.v1.xml
pcmdi.test.mytest.v1.xml