netCDF 4.2.1.1
DAP Support

Beginning with netCDF version 4.1, optional support is provided for accessing data through OPeNDAP servers using the DAP protocol.

DAP support is automatically enabled if a usable curl library can be located using the curl-config program. DAP support can forcibly be enabled or disabled using the –enable-dap flag or the –disable-dap flag, respectively. If enabled, then DAP support requires access to the curl library. Refer to the installation manual for details: The NetCDF Installation and Porting Guide.

DAP uses a data model that is different from that supported by netCDF, either classic or enhanced. Generically, the DAP data model is encoded textually in a DDS (Dataset Descriptor Structure). There is a second data model for DAP attributes, which is encoded textually in a DAS (Dataset Attribute Structure). For detailed information about the DAP DDS and DAS, refer to the OPeNDAP web site http://opendap.org.

OPeNDAP Data

In order to access an OPeNDAP data source through the netCDF API, the file name normally used is replaced with a URL with a specific format. The URL is composed of four parts.

It is possible to see what the translation does to a particular DAP data source in either of two ways. First, one can examine the DDS source through a web browser and then examine the translation using the ncdump -h command to see the netCDF Classic translation. The ncdump output will actually be the union of the DDS with the DAS, so to see the complete translation, it is necessary to view both.

For example, if a web browser is given the following, the first URL will return the DDS for the specified dataset, and the second URL will return the DAS for the specified dataset.

     http://test.opendap.org:8080/dods/dts/test.01.dds
     http://test.opendap.org:8080/dods/dts/test.01.das

Then by using the following ncdump command, it is possible to see the equivalent netCDF Classic translation.

     ncdump -h http://test.opendap.org:8080/dods/dts/test.01

The DDS output from the web server should look like this.

Dataset {
    Byte b;
    Int32 i32;
    UInt32 ui32;
    Int16 i16;
    UInt16 ui16;
    Float32 f32;
    Float64 f64;
    String s;
    Url u;
} SimpleTypes;

The DAS output from the web server should look like this.

Attributes {
    Facility {
        String PrincipleInvestigator ``Mark Abbott'', ``Ph.D'';
        String DataCenter ``COAS Environmental Computer Facility'';
        String DrifterType ``MetOcean WOCE/OCM'';
    }
    b {
        String Description ``A test byte'';
        String units ``unknown'';
    }
    i32 {
        String Description ``A 32 bit test server int'';
        String units ``unknown'';
    }
}

The output from ncdump should look like this.

netcdf test {
dimensions:
        stringdim64 = 64 ;
variables:
        byte b ;
                b:Description = "A test byte" ;
                b:units = "unknown" ;
        int i32 ;
                i32:Description = "A 32 bit test server int" ;
                i32:units = "unknown" ;
        int ui32 ;
        short i16 ;
        short ui16 ;
        float f32 ;
        double f64 ;
        char s(stringdim64) ;
        char u(stringdim64) ;
}

Note that the fields of type String and type URL have suddenly acquired a dimension. This is because strings are translated to arrays of char, which requires adding an extra dimension. The size of the dimension is determined in a variety of ways and can be specified. It defaults to 64 and when read, the underlying string is either padded or truncated to that length.

Also note that the Facility attributes do not appear in the translation because they are neither global nor associated with a variable in the DDS.

Alternately, one can get the text of the DDS as a global attribute by using the client parameters mechanism . In this case, the parameter “[show=dds]” can be prefixed to the URL and the data retrieved using the following command

     ncdump -h [show=dds]http://test.opendap.org:8080/dods/dts/test.01.dds

The ncdump -h command will then show both the translation and the original DDS. In the above example, the DDS would appear as the global attribute “_DDS” as follows.

netcdf test {
...
variables:
        :_DDS = "Dataset { Byte b; Int32 i32; UInt32 ui32; Int16 i16;
                 UInt16 ui16; Float32 f32; Float64 f64;
                 Strings; Url u; } SimpleTypes;"

        byte b ;
...
}

DAP to NetCDF Translation Rules

Two translations are currently available.

DAP 2 Protocol to netCDF-3 DAP 2 Protocol to netCDF-4

Translation Rules

The current default translation code translates the OPeNDAP protocol to netCDF-3 (classic). This netCDF-3 translation converts an OPeNDAP DAP protocol version 2 DDS to netCDF-3 and is designed to mimic as closely as possible the translation provided by the libnc-dap system. In addition, a translation to netCDF-4 (enhanced) is provided that is entirely new.

For illustrative purposes, the following example will be used.

Dataset {
  Int32 f1;
  Structure {
    Int32 f11;        
    Structure {
      Int32 f1[3];
      Int32 f2;
    } FS2[2]; 
  } S1; 
  Structure {
    Grid {
      Array:
        Float32 temp[lat=2][lon=2];
      Maps:
        Int32 lat[lat=2];
        Int32 lon[lon=2];
    } G1;
  } S2;
  Grid {
      Array:
        Float32 G2[lat=2][lon=2];
      Maps:
        Int32 lat[2];
        Int32 lon[2];
  } G2;
  Int32 lat[lat=2];
  Int32 lon[lon=2];
} D1;
\code

\subsection Variable Definition

The set of netCDF variables is derived from the fields with primitive
base types as they occur in Sequences, Grids, and Structures. The
field names are modified to be fully qualified initially. For the
above, the set of variables are as follows. The coordinate variables
within grids are left out in order to mimic the behavior of libnc-dap.

\code
    f1
    S1.f11
    S1.FS2.f1
    S1.FS2.f2
    S2.G1.temp
    S2.G2.G2
    lat
    lon 

Dimension Translation

A variable's rank is determined from three sources.

If the type of the netCDF variable is char, then an extra string dimension is added as the last dimension.

translation

For dimensions, the rules are as follows.

Fields in dimensioned structures inherit the dimension of the structure; thus the above list would have the following dimensioned variables.

        S1.FS2.f1 -> S1.FS2.f1[2][3]
        S1.FS2.f2 -> S1.FS2.f2[2]
        S2.G1.temp -> S2.G1.temp[lat=2][lon=2]
        S2.G1.lat -> S2.G1.lat[lat=2]
        S2.G1.lon -> S2.G1.lon[lon=2]
        S2.G2.G2 -> S2.G2.lon[lat=2][lon=2]
        S2.G2.lat -> S2.G2.lat[lat=2]
        S2.G2.lon -> S2.G2.lon[lon=2]
        lat -> lat[lat=2]
        lon -> lon[lon=2] 

Collect all of the dimension specifications from the DDS, both named and anonymous (unnamed) For each unique anonymous dimension with value NN create a netCDF dimension of the form "XX_<i>=NN", where XX is the fully qualified name of the variable and i is the i'th (inherited) dimension of the array where the anonymous dimension occurs. For our example, this would create the following dimensions.

        S1.FS2.f1_0 = 2 ;
        S1.FS2.f1_1 = 3 ;
        S1.FS2.f2_0 = 2 ;
        S2.G2.lat_0 = 2 ;
        S2.G2.lon_0 = 2 ; 

If however, the anonymous dimension is the single dimension of a MAP vector in a Grid then the dimension is given the same name as the map vector This leads to the following.

        S2.G2.lat_0 -> S2.G2.lat
        S2.G2.lon_0 -> S2.G2.lon 

For each unique named dimension "<name>=NN", create a netCDF dimension of the form "<name>=NN", where name has the qualifications removed. If this leads to duplicates (i.e. same name and same value), then the duplicates are ignored. This produces the following.

        S2.G2.lat -> lat
        S2.G2.lon -> lon 

Note that this produces duplicates that will be ignored later.

At this point the only dimensions left to process should be named dimensions with the same name as some dimension from step number 3, but with a different value. For those dimensions create a dimension of the form "<name>M=NN" where M is a counter starting at 1. The example has no instances of this.

Finally and if needed, define a single UNLIMITED dimension named "unlimited" with value zero. Unlimited will be used to handle certain kinds of DAP sequences (see below).

This leads to the following set of dimensions.

dimensions:
  unlimited = UNLIMITED;
  lat = 2 ;
  lon = 2 ;
  S1.FS2.f1_0 = 2 ;
  S1.FS2.f1_1 = 3 ;
  S1.FS2.f2_0 = 2 ;

Dimension Translation

The steps for variable name translation are as follows.

Take the set of variables captured above. Thus for the above DDS, the following fields would be collected.

        f1
        S1.f11
        S1.FS2.f1
        S1.FS2.f2
        S2.G1.temp
        S2.G2.G2
        lat
        lon 

All grid array variables are renamed to be the same as the containing grid and the grid prefix is removed. In the above DDS, this results in the following changes.

        G1.temp -> G1
        G2.G2 -> G2 

It is important to note that this process could produce duplicate variables (i.e. with the same name); in that case they are all assumed to have the same content and the duplicates are ignored. If it turns out that the duplicates have different content, then the translation will not detect this. YOU HAVE BEEN WARNED.

The final netCDF-3 schema (minus attributes) is then as follows.

netcdf t {
dimensions:
        unlimited = UNLIMITED ;
        lat = 2 ;
        lon = 2 ;
        S1.FS2.f1_0 = 2 ;
        S1.FS2.f1_1 = 3 ;
        S1.FS2.f2_0 = 2 ;
variables:
        int f1 ;
        int lat(lat) ;
        int lon(lon) ;
        int S1.f11 ;
        int S1.FS2.f1(S1.FS2.f1_0, S1.FS2.f1_1) ;
        int S1.FS2.f2(S1_FS2_f2_0) ;
        float S2.G1(lat, lon) ;
        float G2(lat, lon) ;
}

In actuality, the unlimited dimension is dropped because it is unused.

There are differences with the original libnc-dap here because libnc-dap technically was incorrect. The original would have said this, for example.

int S1.FS2.f1(lat, lat) ;

Note that this is incorrect because it dimensions S1.FS2.f1(2,2) rather than S1.FS2.f1(2,3).

DAP DDS Sequences

Any variable (as determined above) that is contained directly or indirectly by a Sequence is subject to revision of its rank using the following rules.

Let the variable be contained in Sequence Q1, where Q1 is the innermost containing sequence. If Q1 is itself contained (directly or indirectly) in a sequence, or Q1 is contained (again directly or indirectly) in a structure that has rank greater than 0, then the variable will have an initial UNLIMITED dimension. Further, all dimensions coming from "above" and including (in the containment sense) the innermost Sequence, Q1, will be removed and replaced by that single UNLIMITED dimension. The size associated with that UNLIMITED is zero, which means that its contents are inaccessible through the netCDF-3 API. Again, this differs from libnc-dap, which leaves out such variables. Again, however, this difference is backward compatible.

If the variable is contained in a single Sequence (i.e. not nested) and all containing structures have rank 0, then the variable will have an initial dimension whose size is the record count for that Sequence. The name of the new dimension will be the name of the Sequence.

Consider this example.

Dataset {
  Structure {
    Sequence {
      Int32 f1[3];
      Int32 f2;
    } SQ1;
  } S1[2]; 
  Sequence {
    Structure {
      Int32 x1[7];
    } S2[5];
  } Q2;
} D;

The corresponding netCDF-3 translation is pretty much as follows (the value for dimension Q2 may differ).

dimensions:
    unlimited = UNLIMITED ; // (0 currently)
    S1.SQ1.f1_0 = 2 ;
    S1.SQ1.f1_1 = 3 ;
    S1.SQ1.f2_0 = 2 ;
    Q2.S2.x1_0 = 5 ;
    Q2.S2.x1_1 = 7 ;
    Q2 = 5 ;
variables:
    int S1.SQ1.f1(unlimited, S1.SQ1.f1_1) ;
    int S1.SQ1.f2(unlimited) ;
    int Q2.S2.x1(Q2, Q2.S2.x1_0, Q2.S2.x1_1) ;

Note that for example S1.SQ1.f1_0 is not actually used because it has been folded into the unlimited dimension.

Note that for sequences without a leading unlimited dimension, there is a performance cost because the translation code has to walk the data to determine how many records are associated with the sequence. Since libnc-dap did essentially the same thing, it can be assumed that the cost is not prohibitive.

Translation Rules

A DAP to netCDF-4 translation also exists, but is not the default and in any case is only available if the "–enable-netcdf-4" option is specified at configure time. This translation includes some elements of the libnc-dap translation, but attempts to provide a simpler (but not, unfortunately, simple) set of translation rules than is used for the netCDF-3 translation. Please note that the translation is still experimental and will change to respond to unforeseen problems or to suggested improvements.

This text will use this running example.

Dataset {
  Int32 f1[fdim=10];
  Structure {
    Int32 f11;        
    Structure {
      Int32 f1[3];
      Int32 f2;
    } FS2[2]; 
  } S1; 
  Grid {
    Array:
      Float32 temp[lat=2][lon=2];
    Maps:
      Int32 lat[2];
      Int32 lon[2];
  } G1;
  Sequence {
    Float64 depth;
  } Q1;
} D
\code

\subsection Variable Definition

The rule for choosing variables is relatively simple. Start with the
names of the top-level fields of the DDS. The term top-level means
that the object is a direct subnode of the Dataset object. In our
example, this produces the set [f1, S1, G1, Q1].

\subsection Dimension Definition

The rules for choosing and defining dimensions is as follows.

Collect the set of dimensions (named and anonymous) directly
associated with the variables as defined above. This means that
dimensions within user-defined types are ignored. From our example,
the dimension set is [fdim=10,lat=2,lon=2,2,2]. Note that the
unqualified names are used.

All remaining anonymous dimensions are given the name "<var>_NN",
where "<var>" is the unqualified name of the variable in which the
anonymous dimension appears and NN is the relative position of that
dimension in the dimensions associated with that array. No instances
of this rule occur in the running example.

Remove duplicate dimensions (those with same name and value). Our
dimension set now becomes [fdim=10,lat=2,lon=2].

The final case occurs when there are dimensions with the same name but
with different values. For this case, the size of the dimension is
appended to the dimension name.

\subsection Type Definition

The rules for choosing user-defined types are as follows.

For every Structure, Grid, and Sequence, a netCDF-4 compound type is
created whose fields are the fields of the Structure, Sequence, or
Grid. With one exception, the name of the type is the same as the
Structure or Grid name suffixed with "_t". The exception is that the
compound types derived from Sequences are instead suffixed with
"_record_t".

The types of the fields are the types of the corresponding field of
the Structure, Sequence, or Grid. Note that this type might be itself
a user-defined type.

From the example, we get the following compound types.

\code
         compound FS2_t {
             int f1(3);
             int f2;
         };
         compound S1_t {
             int f11;
             FS2_t FS2(2);  
         };
         compound G1_t {
             float temp(2,2);
             int lat(2);
             int lon(2);
         }
         compound Q1_record_t {
             double depth;
         };

For all sequences of name X, also create this type.

             X_record_t (*) X_t

In our example, this produces the following type.

             Q1_record_t (*) Q1_t

If a Sequence, Q has a single field F, whose type is a primitive type, T, (e.g., int, float, string), then do not apply the previous rule, but instead replace the whole sequence with the the following field.

             T (*) Q.f

a Translation

The decision about whether to translate to netCDF-3 or netCDF-4 is determined by applying the following rules in order.

Caching

In an effort to provide better performance for some access patterns, client-side caching of data is available. The default is no caching, but it may be enabled by prefixing the URL with "[cache]".

Caching operates basically as follows.

When a URL is first accessed using nc_open(), netCDF automatically does a pre-fetch of selected variables. These include all variables smaller than a specified (and user definable) size. This allows, for example, quick access to coordinate variables.

Whenever a request is made using some variant of the nc_get_var() API procedures, the complete variable is fetched and stored in the cache as a new cache entry. Subsequence requests for any part of that variable will access the cache entry to obtain the data.

The cache may become too full, either because there are too many entries or because it is taking up too much disk space. In this case cache entries are purged until the cache size limits are reached. The cache purge algorithm is LRU (least recently used) so that variables that are repeatedly referenced will tend to stay in the cache.

The cache is completely purged when nc_close() is invoked.

In order to decide if you should enable caching, you will need to have some understanding of the access patterns of your program.

The ncdump program always dumps one or more whole variables so it turns on caching.

If your program accesses only parts of a number of variables, then caching should probably not be used since fetching whole variables will probably slow down your program for no purpose.

Unfortunately, caching is currently an all or nothing proposition, so for more complex access patterns, the decision to cache or not may not have an obvious answer. Probably a good rule of thumb is to avoid caching initially and later turn it on to see its effect on performance.

Client Parameters

Currently, a limited set of client parameters is recognized. Parameters not listed here are ignored, but no error is signalled.

Parameter Name Legal Values Semantics

on Debugging OPeNDAP Access

The OPeNDAP support makes use of the logging facility of the underlying oc system. Note that this is currently separate from the existing netCDF logging facility. Turning on this logging can sometimes give important information. Logging can be enabled by prefixing the url with the client parameter [log] or [log=filename], where the first case will send log output to standard error and the second will send log output to the specified file.

Users should also be aware that the DAP subsystem creates temporary files of the name dataddsXXXXXX, where XXXXX is some random string. If the program using the DAP subsystem crashes, these files may be left around. It is perfectly safe to delete them. Also, if you are accessing data over an NFS mount, you may see some .nfsxxxxx files; those can be ignored as well. 4.12.4 HTTP Configuration.

Limited support for configuring the http connection is provided via parameters in the “.httprc” configuration file. Although deprecated, the name “.dodsrc” may also be used. The relevant .httprc file is located by first looking in the current working directory, and if not found, then looking in the directory specified by the “$HOME” environment variable.

Entries in the .httprc file are of the form:

     ['['<url>']']<key>=<value>

That is, it consists of a key name and value pair and optionally preceded by a url enclosed in square brackets.

For given KEY and URL strings, the value chosen is as follows:

If URL is null, then look for the .dodsrc entry that has no url prefix and whose key is same as the KEY for which we are looking.

If the URL is not null, then look for all the .dodsrc entries that have a url, URL1, say, and for which URL1 is a prefix (in the string sense) of URL. For example, if URL = http//x.y/a, then it will match entries of the form

              1. [http//x.y/a]KEY=VALUE
              2. [http//x.y/a/b]KEY=VALUE

It will not match an entry of the form

              [http//x.y/b]KEY=VALUE

because “http://x.y/b” is not a string prefix of “http://x.y/a”. Finally from the set so constructed, choose the entry with the longest url prefix: “http//x.y/a/b]KEY=VALUE” in this case.

Currently, the supported set of keys (with descriptions) are as follows.

    HTTP.VERBOSE
        Type: boolean ("1"/"0")
        Description: Produce verbose output, especially using SSL.
        Related CURL Flags: CURLOPT_VERBOSE 
    HTTP.DEFLATE
        Type: boolean ("1"/"0")
        Description: Allow use of compression by the server.
        Related CURL Flags: CURLOPT_ENCODING 
    HTTP.COOKIEJAR
        Type: String representing file path
        Description: Specify the name of file into which to store cookies. Defaults to in-memory storage.
        Related CURL Flags:CURLOPT_COOKIEJAR 
    HTTP.COOKIEFILE
        Type: String representing file path
        Description: Same as HTTP.COOKIEJAR.
        Related CURL Flags: CURLOPT_COOKIEFILE 
    HTTP.CREDENTIALS.USER
        Type: String representing user name
        Description: Specify the user name for Digest and Basic authentication.
        Related CURL Flags: 
    HTTP.CREDENTIALS.PASSWORD
        Type: String representing password
        Type: boolean ("1"/"0")
        Description: Specify the password for Digest and Basic authentication.
        Related CURL Flags: 
    HTTP.SSL.CERTIFICATE
        Type: String representing file path
        Description: Path to a file containing a PEM cerficate.
        Related CURL Flags: CURLOPT_CERT 
    HTTP.SSL.KEY
        Type: String representing file path
        Description: Same as HTTP.SSL.CERTIFICATE, and should usually have the same value.
        Related CURL Flags: CURLOPT_SSLKEY 
    HTTP.SSL.KEYPASSWORD
        Type: String representing password
        Description: Password for accessing the HTTP.SSL.KEY/HTTP.SSL.CERTIFICATE
        Related CURL Flags: CURLOPT_KEYPASSWORD 
    HTTP.SSL.CAPATH
        Type: String representing directory
        Description: Path to a directory containing trusted certificates for validating server sertificates.
        Related CURL Flags: CURLOPT_CAPATH 
    HTTP.SSL.VALIDATE
        Type: boolean ("1"/"0")
        Description: Cause the client to verify the server's presented certificate.
        Related CURL Flags: CURLOPT_SSL_VERIFYPEER, CURLOPT_SSL_VERIFYHOST 
    HTTP.TIMEOUT
        Type: String ("dddddd")
        Description: Specify the maximum time in seconds that you allow the http transfer operation to take.
        Related CURL Flags: CURLOPT_TIMEOUT, CURLOPT_NOSIGNAL 
    HTTP.PROXY_SERVER
        Type: String representing url to access the proxy: (e.g.http://[username:password@]host[:port])
        Description: Specify the needed information for accessing a proxy.
        Related CURL Flags: CURLOPT_PROXY, CURLOPT_PROXYHOST, CURLOPT_PROXYUSERPWD 

The related curl flags line indicates the curl flags modified by this key. See the libcurl documentation of the curl_easy_setopt() function for more detail http://curl.haxx.se/libcurl/c/curl_easy_setopt.html.

For ESG, the following entries must be specified:

    HTTP.SSL.VALIDATE
    HTTP.COOKIEJAR
    HTTP.SSL.CERTIFICATE
    HTTP.SSL.KEY
    HTTP.SSL.CAPATH 

Additionally, for ESG, the HTTP.SSL.CERTIFICATE and HTTP.SSL.KEY entries should have same value, which is the file path for the certificate produced by MyProxyLogon. The HTTP.SSL.CAPATH entry should be the path to the "certificates" directory produced by MyProxyLogon.

 All Data Structures Files Functions Variables Typedefs Defines

Generated on Tue Aug 6 2013 11:40:57 for netCDF. NetCDF is a Unidata library.